Chapter 1


Introduction 


Decision making is choosing among alternatives to try to maximize a given objective
function. Decision making is deeply embedded in every aspect of life. Most problems in science,
engineering, society, politics, and psychology that have economic objectives demand a
suitable methodology for making such decisions.

The  process  of  choosing  the  most  preferable  alternative  among  all  possible  ones  is  called 
optimization  when  the  problem  involves  a  single  decision  maker.  However,  many  important 
decision  problems  involve  multiple  autonomous  decision  makers  acting  simultaneously.  In 
such  problems,  decision  making  is  substantially  more  complicated  than  in  a  single  agent 
optimization  problem  since  each  decision  maker’s  best  decision  generally  depends  on  what 
the others decide. Game Theory is a field of applied mathematics that addresses such
multi-agent decision making problems. The decision making problem may involve either
full or uncertain information. When the information is uncertain, it may or may not be
characterized by a known probability distribution.

This  dissertation  proposes  a  novel  way  of  solving  games  associated  with  non-probabilistic 
uncertainty,  studies  the  properties  of  this  approach,  and  discusses  insights  that  might  be 
applied  to  other  similar  problems.  In  particular,  Chapter  2  studies  a  model  of  optimism  as 
an  augmented  degree  of  rationalization  under  uncertainty  when  agents  are  non-cooperative. 
Chapter  3  studies  a  limitation  of  imperfect  communication  in  establishing  common  knowledge 
and  robust  performance  under  uncertainty  when  agents  are  cooperative. 

Chapter  1  provides  preliminary  background.  The  first  section  describes  probabilistic  and 
non-probabilistic  single  agent  decision  criteria.  The  second  section  defines  a  game  and  Nash 
equilibrium  concepts.  The  third  section  introduces  a  list  of  historically  important  game 
theory solution concepts. Finally, the last section discusses the issue of common knowledge
and communication.

1.1 Single-agent Decision Making with Uncertainty


A  mathematical  formulation  of  an  optimization  problem  should  specify  the  following: 
optimization  objective,  optimization  variables,  and  constraint  sets  that  define  feasibility  of 
the  optimization  variables. 

Consider a single-agent problem whose optimization variable $x$ belongs to some feasible
set $X$, with objective function $u : X \to \mathbb{R}$. Then the mathematical optimization problem
has the form

\[ \arg\max_{x \in X} u(x). \tag{1.1} \]

A variable $x^* \in X$ is called a solution of problem (1.1) if

\[ u(x^*) \ge u(x) \quad \text{for all } x \in X. \]

A solution exists, for instance, if $X$ is a non-empty compact set and $u(x)$ is continuous.
Problem (1.1) is called a full information optimization problem since $u$ and $X$ are fully
known.

However, in many engineering and economic problems, unknown variables affect the
objective function. Suppose these unknown variables are represented by $\theta \in \Theta$. Then
consider a modified objective function $u : X \times \Theta \to \mathbb{R}$, and a modified optimization problem:

\[ \arg\max_{x \in X} u(x, \theta). \tag{1.2} \]

Generally, the solution of problem (1.2) depends on $\theta$, and the problem cannot be solved
without knowing $\theta$.


1.1.1  Alternative  Objectives 

Instead of using the original objective function $u(x, \theta)$, the designer selects an alternative
objective function $f(x)$ that does not depend on the unknown variable $\theta$. This alternative
objective function should be chosen to capture the physical and useful meaning of the
underlying context, and should be computable with only the available information. We explain
the standard choices for $f(x)$, which are classified into probabilistic and non-probabilistic
choices.

1.1.2 Probabilistic Criteria


Expected  Value 

If $\theta$ is a random variable (or a vector of random variables) following a probability
distribution $P$, then one alternative objective is the expected value:

\[ f(x, P) := E_P[u(x, \theta)]. \]

This is a popular and widely accepted alternative in a vast number of engineering problems
if: (i) $P$ is known with some accuracy, and (ii) the expected value is a useful performance
metric. The second condition is satisfied when many independent and identically distributed
copies of $\theta$ occur in repeated instances of the decision problem and one is interested in the
average value of the objective function.


Expected  Utility 

The expected value criterion takes into account only the average value of the objective
function. It does not capture a human player’s true valuation, which may incorporate risk
aversion. In order to overcome this shortfall, John von Neumann and Oskar Morgenstern [41]
proposed the use of a concave utility function $T$, following the school of Daniel Bernoulli [8].
The alternative objective function is then defined as

\[ f(x, P, T) := E_P[T(u(x, \theta))]. \]
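As a small illustration of how the expected value and expected utility criteria can rank decisions differently, consider the following sketch. The two decisions, their payoffs, the probabilities, and the choice $T = \sqrt{\cdot}$ are all hypothetical and chosen only for illustration; they do not come from the text.

```python
import math

def expected_value(payoffs, probs):
    """E_P[u(x, theta)] for a finite set of scenarios theta."""
    return sum(p * u for u, p in zip(payoffs, probs))

def expected_utility(payoffs, probs, T):
    """E_P[T(u(x, theta))] for a utility transformation T."""
    return sum(p * T(u) for u, p in zip(payoffs, probs))

# Hypothetical example: two scenarios, equally likely.
probs = [0.5, 0.5]
safe = [50.0, 50.0]    # payoff of the "safe" decision in each scenario
risky = [0.0, 110.0]   # payoff of the "risky" decision in each scenario

T = math.sqrt          # one possible concave (risk-averse) utility; an assumption

ev_safe, ev_risky = expected_value(safe, probs), expected_value(risky, probs)
eu_safe, eu_risky = expected_utility(safe, probs, T), expected_utility(risky, probs, T)

print("expected value  :", ev_safe, ev_risky)
print("expected utility:", eu_safe, eu_risky)
```

Under these assumed numbers, the risky decision has the larger expected value, while the safe decision has the larger expected utility, which is the risk-aversion effect the concave $T$ is meant to capture.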


Cumulative  Prospect  Criterion 

Experiments in behavioral economics show that human subjects do not behave according
to expected utility theory. Allais’ paradox [2] and Ellsberg’s paradox [18] are prominent
examples of this observation. The cumulative prospect criterion [39] is an attempt to describe
human behavior, taking into consideration the following major issues: (i) one’s valuation
is relative to a certain reference point; (ii) one weights losses more than gains; (iii) one
overweights extreme but unlikely events and underweights average events. In a general form,
this criterion is an expected value with transformations of the objective function ($T$) and of
the probability distribution ($S$), where $T$ and $S$ are chosen in accordance with the behavioral
observations above. Thus, the criterion corresponds to the following function:

\[ f(x, P, T, S) := E_{S(P)}[T(u(x, \theta))]. \]

1.1.3 Non-probabilistic Criteria


Any probabilistic criterion assumes prior knowledge of the probability distribution $P$ of
the uncertainty $\theta$. However, this assumption may not be appropriate and, in some situations,
one may know only the set $\Theta$ of possible values of $\theta$. This subsection covers a few popular
non-probabilistic criteria based on only that information.


Pessimism  Criterion 

The pessimism, or robust, criterion considers the worst-case scenario. Treating the
uncertainty as an adversary, one seeks to maximize the worst-case performance. That is, the
objective is defined as

\[ f(x, \Theta) := \min_{\theta \in \Theta} u(x, \theta). \tag{1.3} \]

A maximizer of (1.3) might be appropriate in applications where a guaranteed level of
performance is required.


Optimism  Criterion 

This criterion assumes that the uncertainty is favorable. It corresponds to the following
objective function:

\[ f(x, \Theta) := \max_{\theta \in \Theta} u(x, \theta). \]

We combine this criterion with pessimism later.


Regret  Criterion 

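The regret is the loss in performance due to the uncertainty. Define

\[ x^*(\theta) = \arg\max_{x \in X} u(x, \theta). \]

Then the alternative objective function is defined through the worst-case regret as follows:

\[ f(x, \Theta) := \min_{\theta \in \Theta} \{ u(x, \theta) - u(x^*(\theta), \theta) \}. \]

Since $u(x, \theta) - u(x^*(\theta), \theta) \le 0$ is the negative of the regret, maximizing $f$ minimizes the largest regret.
Instead of the difference, an alternative definition of the regret is the ratio $u(x^*(\theta), \theta)/u(x, \theta)$.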


Laplace  Criterion 

Pierre-Simon Laplace proposed to use the uniform distribution over the uncertainty when
no other information is available. This approach, called Laplace’s principle of indifference,
corresponds to the following objective function:

\[ f(x, U) := E_U[u(x, \theta)], \]

where $U$ is the uniform distribution over all possible values of $\theta$.


Hurwicz  Criterion 


The Hurwicz criterion [22] is a generalization of pessimism and optimism, with the control
parameter $\alpha \in [0, 1]$:

\[ f(x, \alpha, \Theta) := \alpha \max_{\theta \in \Theta} u(x, \theta) + (1 - \alpha) \min_{\theta \in \Theta} u(x, \theta). \]

One caveat of this criterion is that the choice of $\alpha$ is arbitrary.
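The non-probabilistic criteria of this subsection can be compared on a small finite example. The sketch below assumes a hypothetical payoff table `u[x][theta]` with three decisions and three scenarios; the numbers and the helper names are illustrative only and are not taken from the text.

```python
# Hypothetical payoff table: u[x] lists u(x, theta) for theta = 0, 1, 2.
u = {
    "a": [4.0, 4.0, 4.0],
    "b": [0.0, 5.0, 10.0],
    "c": [3.0, 6.0, 7.0],
}
n_theta = 3

def pessimism(x):
    """min over theta of u(x, theta), as in (1.3)."""
    return min(u[x])

def optimism(x):
    """max over theta of u(x, theta)."""
    return max(u[x])

def laplace(x):
    """Expected payoff under the uniform distribution over theta."""
    return sum(u[x]) / len(u[x])

def hurwicz(x, alpha):
    """alpha * best case + (1 - alpha) * worst case."""
    return alpha * max(u[x]) + (1 - alpha) * min(u[x])

# u(x*(theta), theta): the best achievable payoff in each scenario.
best_per_theta = [max(u[x][t] for x in u) for t in range(n_theta)]

def regret(x):
    """min over theta of u(x, theta) - u(x*(theta), theta); always <= 0."""
    return min(u[x][t] - best_per_theta[t] for t in range(n_theta))

def best(criterion):
    """Decision maximizing the given alternative objective."""
    return max(u, key=criterion)

print("pessimism:", best(pessimism))
print("optimism :", best(optimism))
print("laplace  :", best(laplace))
print("regret   :", best(regret))
print("hurwicz(0.3):", best(lambda x: hurwicz(x, 0.3)))
```

On this assumed table the criteria disagree: pessimism selects `a`, optimism selects `b`, and the Laplace and regret criteria select `c`. Setting `alpha` to 0 or 1 in `hurwicz` recovers pessimism and optimism respectively.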


1.1.4  Criteria  for  Super-Problem 

Another modeling approach is to combine probabilistic and non-probabilistic criteria in a
super-problem. A super-problem is defined by constructing a two-fold alternative objective.
The inner alternative objective uses the expected value criterion with a probability
distribution $P$ over $\theta$. The outer alternative objective considers $P$ as uncertain within a collection of
probability distributions $\mathcal{P}$, and then applies one of the above-mentioned non-probabilistic
criteria.

As one example, a pessimism criterion for the expected value, when one knows only that the
underlying distribution belongs to $\mathcal{P}$, is

\[ f(x, \mathcal{P}) := \min_{P \in \mathcal{P}} E_P[u(x, \theta)]. \]

Similarly, one can define other alternative objectives for optimism, regret, Laplace, and
Hurwicz.
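The pessimism criterion for the expected value can be sketched as follows. The example assumes, for simplicity, that the collection of distributions is a finite list; the payoff table and the distributions are hypothetical.

```python
# Hypothetical payoffs u(x, theta) for theta = 0, 1, 2.
u = {
    "a": [4.0, 4.0, 4.0],
    "b": [0.0, 5.0, 10.0],
}

# Assumed finite family of candidate distributions over theta.
family = [
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
    [0.25, 0.5, 0.25],
]

def expected(x, P):
    """Inner objective: E_P[u(x, theta)]."""
    return sum(p * v for p, v in zip(P, u[x]))

def robust_expected(x):
    """Outer objective: worst expected payoff over the family."""
    return min(expected(x, P) for P in family)

for x in u:
    print(x, [round(expected(x, P), 3) for P in family], round(robust_expected(x), 3))
```

With these assumed numbers, decision `b` has the higher expected payoff under two of the three distributions, yet decision `a` is preferred under the pessimism criterion because its expected payoff does not depend on which distribution is the true one.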


1.2  Complete  Information  Games 

Many decision problems involve more than one decision maker. Decision makers are
autonomous, rational, and independent from each other in making decisions, but they are
tightly connected to each other in the sense that the reward of one agent depends on the
others’ decisions. We use Game Theory to address such multi-agent decision making problems.
Formally, a strategic form game $\Gamma$ is defined as a triplet

\[ \Gamma = (\mathcal{N}, X, u) \tag{1.4} \]

where $\mathcal{N}$ is a set of decision makers (or agents, or players), $X := \prod_i X_i$ is the product space
of each agent $i$’s strategy space $X_i$, and $u := (u_i)$ is a list of each agent’s objective function
$u_i : X \to \mathbb{R}$.

Together with (1.4), traditional game theory further makes the following assumptions:

Axiom  1. 

1.  Instrumentally  rational  individual  action; 

2.  Common  knowledge  on  rationality; 

3.  Consistent  alignment  on  beliefs. 

This dissertation addresses a few issues that arise when those assumptions are challenged
in some game situations.

An agent $i$ is said to be instrumentally rational if she has a well-defined preference
ordering that is represented by $u_i$. The game is said to have common knowledge on rationality
if  each  agent  is  instrumentally  rational,  each  agent  knows  that  each  agent  is  instrumentally 
rational,  each  agent  knows  that  each  agent  knows  that  each  agent  is  instrumentally  rational, 
ad  infinitum.  Also,  each  agent  knows  the  set  of  strategies  and  the  utility  function  of  every 
other  agent.  Consistent  alignment  on  beliefs  means  no  agent  expects  that  an  agent  with  the 
same  information  can  develop  a  different  thought  process  [21]. 

There  exists  a  rich  literature  studying  solution  concepts,  their  existence,  uniqueness, 
refinement, stability, and convergence of algorithms in many different contexts. We focus
our attention on the decision criteria.

First consider a full information game where all agents are aware of $\Gamma$ and Axiom 1.

Definition 1 (Nash Equilibrium in Full Information Game). Let $\Delta X_i$ be the set of probability
measures on $X_i$. For $\sigma \in \prod_i \Delta X_i$, one defines

\[ u_i(\sigma) = E_\sigma(u_i(X)) \]

where $E_\sigma$ is the expectation when $X$ has the distribution $\sigma$.
Then $\sigma^* \in \prod_i \Delta X_i$ is called a Nash equilibrium if, for all $i$,

\[ u_i(\sigma^*) \ge u_i(\sigma_i, \sigma^*_{-i}) \quad \text{for all } \sigma_i \in \Delta X_i. \]

When $\sigma^*$ assigns probability one to some $x^* \in \prod_i X_i$, the equilibrium is called a pure Nash
equilibrium. Otherwise, it is called a mixed strategy Nash equilibrium.

To find a Nash equilibrium, every agent $i$ independently identifies her best response $\sigma_i^*$
to the choices of the other agents. That is, she determines

\[ \sigma_i^* \in \arg\max_{\sigma_i \in \Delta X_i} u_i(\sigma_i, \sigma^*_{-i}). \]
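For a finite game, the best-response condition can be checked directly for pure strategy profiles by enumeration. The sketch below uses a hypothetical two-player coordination game; the payoff numbers and the function name `is_pure_nash` are illustrative assumptions. It only searches for pure Nash equilibria and does not compute mixed ones.

```python
from itertools import product

# Hypothetical 2x2 coordination game: payoffs[(x1, x2)] = (u_1, u_2).
payoffs = {
    (0, 0): (2, 1), (0, 1): (0, 0),
    (1, 0): (0, 0), (1, 1): (1, 2),
}
strategies = [(0, 1), (0, 1)]  # pure strategy sets X_1 and X_2

def is_pure_nash(profile):
    """True if no agent gains by a unilateral pure deviation."""
    for i, own_strategies in enumerate(strategies):
        for deviation in own_strategies:
            deviated = list(profile)
            deviated[i] = deviation
            if payoffs[tuple(deviated)][i] > payoffs[profile][i]:
                return False
    return True

pure_nash = [p for p in product(*strategies) if is_pure_nash(p)]
print(pure_nash)
```

In this assumed game, two pure profiles pass the check, so the example also illustrates that a Nash equilibrium need not be unique.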

This solution assumes that all the agents correctly compute the others’ best strategies,
which requires Axiom 1. This search for the fixed point is discussed by Lismont and Mongin
[27], and is argued to be a form of interactive rationality by Aumann [5]. Nash has shown that any game
with a finite number of players with finite strategy sets has at least one mixed strategy Nash
equilibrium [33]. When the utility functions are diagonally concave and the sets of strategies
are compact, Rosen [35] has shown that the game has a pure Nash equilibrium.

When  the  Nash  equilibrium  is  not  unique,  the  meaning  of  each  equilibrium  becomes 
questionable  and  the  selection  among  equilibria  requires  some  care.  The  next  sections  review 
some  of  those  issues. 


1.2.1  Iterated  Strict  Dominance 

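Iterated strict dominance is a solution concept based on the survival of strategies, introduced by Luce
and Raiffa [28]. The elimination process is repeated indefinitely. More precisely, let $X_i$ be agent
$i$’s pure strategy space and let $\Delta X_i$ be her mixed strategy space.

Definition 2. Initialize sets $S_i^0 := X_i$ and $\Sigma_i^0 := \Delta X_i$. Recursively define

\[ S_i^n := \{ s_i \in S_i^{n-1} \mid \nexists\, \sigma_i \in \Sigma_i^{n-1} \text{ s.t. } u_i(\sigma_i, s_{-i}) > u_i(s_i, s_{-i}) \text{ for all } s_{-i} \in S_{-i}^{n-1} \} \]

\[ \Sigma_i^n := \{ \sigma_i \in \Sigma_i^0 \mid \sigma_i(s_i) > 0 \text{ only if } s_i \in S_i^n \}. \]

Finally we define

\[ S_i^\infty := \bigcap_{n=0}^{\infty} S_i^n \]

\[ \Sigma_i^\infty := \{ \sigma_i \mid \nexists\, \sigma_i' \text{ s.t. } u_i(\sigma_i', s_{-i}) > u_i(\sigma_i, s_{-i}) \text{ for all } s_{-i} \in S_{-i}^\infty,\ \sigma_i', \sigma_i \in \Delta S_i^\infty \}. \]

Then $S_i^\infty$ and $\Sigma_i^\infty$ are agent $i$’s pure/mixed strategies that survive iterated strict dominance.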
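The elimination process can be sketched for a finite two-player game. Note the simplification: the definition above allows a pure strategy to be dominated by a mixed strategy, whereas this sketch only checks dominance by other pure strategies, so it may keep strategies that the full definition would eliminate. The payoff matrices are a hypothetical example.

```python
def iterated_strict_dominance(u1, u2):
    """Iteratively remove pure strategies strictly dominated by another
    surviving pure strategy (simplification: no mixed-strategy dominance).

    u1[r][c], u2[r][c] are the payoffs of the row and column player.
    Returns the surviving row indices and column indices.
    """
    rows = list(range(len(u1)))
    cols = list(range(len(u1[0])))
    changed = True
    while changed:
        changed = False
        for r in rows[:]:
            # r is dominated if some other row d is strictly better
            # against every surviving column.
            if any(all(u1[d][c] > u1[r][c] for c in cols) for d in rows if d != r):
                rows.remove(r)
                changed = True
        for c in cols[:]:
            if any(all(u2[r][d] > u2[r][c] for r in rows) for d in cols if d != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Hypothetical 3x3 game (rows U, M, D; columns L, M, R).
u1 = [[4, 5, 6],
      [2, 8, 3],
      [3, 9, 2]]
u2 = [[3, 1, 2],
      [1, 4, 6],
      [0, 6, 8]]

surviving_rows, surviving_cols = iterated_strict_dominance(u1, u2)
print(surviving_rows, surviving_cols)
```

In this assumed game no row is dominated at first; the middle column is removed in the first round, which then makes two rows dominated, which in turn makes the right column dominated. Only the profile (U, L) survives.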


1.2.2  Focal  Points 

Schelling [37] argues that the strategic form game definition washes
away too much information by which agents in a coordination game may otherwise be able
to coordinate better than Nash equilibrium. In one of Schelling’s experiments, two players
with no communication are asked to meet in New York City on a fixed day, but are given
neither a time nor a location; most of the participants choose Grand Central Station
at noon. A focal point is a concept, which allows a multitude of different mathematical
definitions, about a feature of such a coordination game that distinguishes one combination of
actions from the others. While universally acknowledged by game theorists, it has
not been assimilated into formal game theory [14].


1.2.3  Selection  Theory  (Trembling  Hand) 

Selten’s equilibrium selection theory [38] is a method for selecting among multiple Nash
equilibria. This theory considers the possibility of a small operational error, called a trembling
hand. Selten argues that agents should reject Nash equilibria in which the reward deteriorates
significantly if an agent makes a small error.


1.2.4  Correlated  Equilibrium 

The correlated equilibrium proposed by R. Aumann [3] is a generalized Nash equilibrium
that introduces a preplay discussion and a public signal from nature that obeys a common
prior. During the preplay discussion, agents agree to a correlating device. A correlating
device is a triple

\[ (\Omega, \{H_i\}, P) \]

where $\Omega$ is the sample space of the device, $P$ is a probability measure on $\Omega$, and $H_i$ is agent
$i$’s information partition on $\Omega$. The correlating device notifies agent $i$ of $h_i(\omega) \in H_i$
upon $\omega$ occurring. Recall that $X_i$ is $i$’s pure strategy space. Let $C_i$ be the collection of maps
$r_i : H_i \to X_i$.

Definition 3 (Correlated Equilibrium). A correlated equilibrium $r^* = (r_i^*) \in \prod_i C_i$ relative
to the correlating device $(\Omega, \{H_i\}, P)$ is a Nash equilibrium in $r$-strategies. That is, for all $i$,

\[ E[u_i(r_i^*(h_i(\omega)), r_{-i}^*(h_{-i}(\omega)))] \ge E[u_i(r_i(h_i(\omega)), r_{-i}^*(h_{-i}(\omega)))] \quad \text{for all } r_i \in C_i. \]

The set of correlated equilibria is at least as large as the set of mixed strategy Nash
equilibria since the coordination signals can correspond to independent randomizations of
the strategies by the different agents [19]. Moreover, in many games, when such coordination
is possible, it can improve the agents’ utility.
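The equilibrium condition can be checked numerically in a simplified setting. The sketch below assumes a canonical device in which the sample space is the set of pure strategy profiles and each agent is told only her own recommended strategy; this is a special case of the definition above, not the general partition model. The 2x2 payoffs and the candidate distributions are hypothetical, and exact fractions are used to avoid rounding issues at ties.

```python
from fractions import Fraction

# Hypothetical 2x2 game: payoff[(row, col)] = (u_1, u_2).
actions = [("U", "D"), ("L", "R")]
payoff = {
    ("U", "L"): (5, 1), ("U", "R"): (0, 0),
    ("D", "L"): (4, 4), ("D", "R"): (1, 5),
}

def is_correlated_equilibrium(dist):
    """dist maps pure profiles to probabilities (the canonical device).

    For every agent i, every recommendation rec, and every deviation dev,
    following the recommendation must be at least as good as deviating.
    """
    for i in (0, 1):
        for rec in actions[i]:
            for dev in actions[i]:
                gain = Fraction(0)
                for profile, prob in dist.items():
                    if profile[i] != rec:
                        continue
                    deviated = list(profile)
                    deviated[i] = dev
                    gain += prob * (payoff[tuple(deviated)][i] - payoff[profile][i])
                if gain > 0:
                    return False
    return True

def expected_payoffs(dist):
    return tuple(sum(p * payoff[a][i] for a, p in dist.items()) for i in (0, 1))

third, quarter = Fraction(1, 3), Fraction(1, 4)
device = {("U", "L"): third, ("D", "L"): third, ("D", "R"): third}
independent = {a: quarter for a in payoff}  # each agent mixes 1/2-1/2 independently

print(is_correlated_equilibrium(device), expected_payoffs(device))
print(is_correlated_equilibrium(independent), expected_payoffs(independent))
```

In this assumed game, the independent half-half randomization is a mixed strategy Nash equilibrium and also passes the check, with expected payoff 5/2 for each agent, while the correlated device yields 10/3 for each agent.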


1.2.5  Rationalizable  Strategies 


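Rationalizable strategies are all the strategies that a rational player could play. This
concept is complementary to iterated strict dominance, and was introduced by Bernheim [7] and
Pearce [34].

Definition 4. The rationalizable strategies for agent $i$ are $\bigcap_{n=0}^{\infty} \tilde{\Sigma}_i^n$, where for each $i$ one
defines recursively

\[ \tilde{\Sigma}_i^0 := \Delta X_i \]

\[ \tilde{\Sigma}_i^n := \{ \sigma_i \in \tilde{\Sigma}_i^{n-1} \mid \exists\, \sigma_{-i} \in \tilde{\Sigma}_{-i}^{n-1} \text{ s.t. } u_i(\sigma_i, \sigma_{-i}) \ge u_i(\sigma_i', \sigma_{-i}) \text{ for all } \sigma_i' \in \tilde{\Sigma}_i^{n-1} \}. \]

In general, the set of rationalizable strategies is contained in the set that survives iterated
strict dominance. In two agent games, they are identical.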


1.2.6  Rational  Expectation 

Similarly to Schelling’s focal point motivation, R. Aumann [5] argues that the strategic
form game does not carry enough information. He redefines a game as a game situation
$\mathcal{G} = (\Gamma, \beta)$, where $\Gamma$ is a strategic form game as we defined earlier, and $\beta$ is called a belief
system, which defines a strategy for each type of each player and represents what each type
believes about how the others will behave. A player’s type uniquely determines the whole hierarchy of
her beliefs.

The expectation of a player is her expected payoff with respect to her belief given her
type. If the belief system agrees with a consistent common prior, and if the strategy the type
prescribes maximizes the player’s expected payoff, then the player’s expectation is called
rational. Still, in general, there exist infinitely many consistent belief systems which can
be paired with $\Gamma$, though a game situation is defined with a single consistent belief system.
Now consider a view from the opposite direction: suppose nature governs a random event
and provides a private signal to each player as a function of that event. Given that signal, a
player has a posterior view on how others will behave. Carefully designed, this set of signals
provides a coordination among players from which no one wants to unilaterally deviate. This
is precisely the definition of a correlated equilibrium, for which a player can compute her
conditional payoff. Aumann’s contribution is to show a relation between rational expectations
in $(\Gamma, \beta)$ and conditional payoffs of correlated equilibria in a game which is closely related to
$\Gamma$. This closely related game is $2\Gamma$, to be defined shortly.

We first redefine $\Gamma$ as follows: A strategic form game $\Gamma$ is defined as a triplet

\[ \Gamma = (\mathcal{N}, \mathcal{L}, u) \tag{1.5} \]

where $\mathcal{N}$ is a set of decision makers (or agents, or players), and $\mathcal{L} := \prod_i \mathcal{L}_i$ is the product space
of each agent $i$’s finite strategy list $\mathcal{L}_i$. Let $\{\mathcal{L}_i\}$ denote the largest set of elements of $\mathcal{L}_i$,
that is, the set of $i$’s feasible strategies without redundancy. $u := (u_i)$ is a list of each agent’s
objective function $u_i : \{\mathcal{L}\} \to \mathbb{R}$.

A doubled list $2\mathcal{L}_i$ is $\mathcal{L}_i \times \{1, 2\}$, where the first copy and the second copy of a strategy
are identical. Finally define a doubled game

\[ 2\Gamma = (\mathcal{N}, 2\mathcal{L}, u) \tag{1.6} \]

where $u : \{2\mathcal{L}\} \to \mathbb{R}$ gives the same payoff as $u$ in $\Gamma$, no matter which copy of a strategy
is used. Note that $\{2\mathcal{L}\} = \{\mathcal{L}\}$.

The introduction of $2\Gamma$ is a device for assigning a correlated equilibrium that leads to any
consistent belief system in $\mathcal{G}$, by splitting the weights of the correlated equilibrium distribution.

Theorem 1. The rational expectations in a game $(\Gamma, \beta)$ are precisely the conditional payoffs
to correlated equilibria in the doubled game $2\Gamma$.

Aumann provides two intuitions (see [5], Section VI) underlying this theorem:

(i) The common prior probability of a consistent belief system $\beta$ in a game $\Gamma$ is essentially
the same thing as a correlated equilibrium of a game $\Gamma^{b}$ closely related to $\Gamma$: that in which
each strategy of each player appears as many times as there are types that play that strategy
in $\beta$.

(ii) The conditional expectation of a strategy in a correlated equilibrium does not change
when other strategies that are identical are amalgamated. Amalgamation is replacing
identical strategies by a single strategy, adding up the prior probabilities of the amalgamated strategies.

Then, define $2\Gamma$ by amalgamating in $\Gamma^{b}$ all identical strategies into two. By (ii), a
conditional correlated equilibrium payoff in $\Gamma^{b}$ for a particular strategy is also a conditional
correlated equilibrium payoff in $2\Gamma$. Together with (i), this yields the above theorem.

The goal of this solution concept is somewhat different from traditional ones in two
ways. First, instead of focusing on the recommendation to agents, this formulation focuses
on what rational players should expect to get, or the value of the game; hence the name
rational expectation. Second, as far as the recommendation is concerned, it does not deal
with ‘equilibria’. It simply suggests that an agent perform a single-agent-like maximization against
her subjective probabilities over the others’ strategies.

One should note also that the level of consistency Aumann requires is weak. The belief
system is required to be consistent with a common prior that exists, but it is not required to
be consistent with the feasibility of outcomes. Rational expectations can be outside of the convex
hull of pure strategy payoffs. This possibility is called the inconsistency of assessment.
The existence of a common prior is not sufficient to guarantee the feasibility of rational
expectations.

1.3 Incomplete Information Game


As  in  the  case  of  a  single-agent  optimization  problem,  many  games  of  interest  have 
incomplete  information.  The  uncertainty  game  we  consider  is  one  where  each  agent  has 
her  own  private  information  which  is  not  known  to  the  others,  and  all  agents  know  this  as 
common  knowledge.  Agent  i’s  private  information  influences  i’s  utility,  best  response,  and 
ultimately  the  solution  of  the  game.  Thus,  in  contrast  to  the  full  information  game  case, 
an  agent  does  not  know  the  exact  preference  ordering  of  other  agents.  We  summarize  the 
private information of agent $i$ by a parameter $\theta_i \in \Theta_i$.


Bayesian  Nash  Equilibrium 

The notion of Bayesian Nash equilibrium is Harsanyi’s proposal to model and
understand an incomplete information game. This model assumes that there exists a common
prior probability distribution $P$ with which nature randomly chooses each agent’s private
information or type. The utility function $u_i$ of agent $i$ is

\[ u_i : X \times \Theta \to \mathbb{R}. \]

The probability distribution of $\theta = (\theta_i) \in \prod_i \Theta_i$ is $P$, and $(\mathcal{N}, X, u, P, \Theta)$ is common
knowledge.

Definition 5 (Mixed Strategy Bayesian Nash Equilibrium). We define $\Delta X_i$ and $u_i(\sigma, \theta)$ as
before. Then $\sigma^* \in \prod_i \Delta X_i$ is called a mixed strategy Bayesian Nash equilibrium if, for all $i$,

\[ \sigma_i^*(\theta_i) \in \arg\max_{\sigma_i \in \Delta X_i} E_P[u_i(\sigma_i, \sigma_{-i}^*(\theta_{-i}), \theta) \mid \theta_i]. \]


1.3.1  Motivation  for  Non-Probabilistic  Solution  Concept 

The concept of Bayesian Nash equilibrium requires some strong assumptions: (i) existence
of a common prior that governs nature’s move; (ii) common knowledge of the prior; (iii)
error-free observation of nature’s selection of private information. As in the case of a single-agent
optimization problem, there are many situations where these assumptions are not satisfied.

Many researchers have explored non-Bayesian models of uncertainty. Knight [23] raised
questions about the suitability of probabilistic characterizations of uncertainty in some
situations. Allais’ paradox [2] and Ellsberg’s paradox [18] are examples of situations where
decision makers violate the expected utility hypothesis. More recently, Binmore [12] and
Lee and Leroux [25] explored more philosophical questions on the inaccuracy, arbitrariness, and
illegitimacy of Bayesianism in games. The behavioral sociology literature also reports that
Bayesian strategies fail to occur in some real world games [40]. A few noteworthy
experiments demonstrate a certainty effect where people prefer less uncertain events, a reflection
effect where people respond differently to gain and loss [6], and preference reversals where
people show different valuations when they buy and when they sell the same lottery [13].
See also [26] for a related discussion of the modeling of uncertainty through a family of
probability distributions.

For such situations, one may consider one of the non-probabilistic criteria developed in
the preceding section: pessimism, optimism, regret, etc. In a single-agent decision making
problem, one is free to choose a criterion as long as it captures the physical meaning of the
problem context. In a multi-agent decision making problem, however, that is not sufficient.
The choice should be strategic.

Take pessimism for example. Suppose an agent cares about robust performance (for
herself). If this fact is known to other agents, they can exploit it and may
strategically select the optimism criterion in order to achieve higher performance for themselves.
Knowing this possibility, one should be careful in choosing the pessimism criterion.

More generally, it should be questioned whether it is purely in the best interest of an agent
to follow a particular decision criterion. Based on this argument, the model in Chapter 2
considers that the choice of decision criterion is strategic. The proposed model has a form
similar to Hurwicz’s criterion in single-agent decision making, but now the parameter $\alpha$ will
be chosen strategically.


1.3.2  Common  Knowledge 

Since an optimally coordinated outcome of the game cannot be worse than an uncoordinated
outcome, in some game situations, such as cooperative games, agents are willing to
share their private information so as to achieve the most efficient outcome for all. For this
purpose, communication, or message exchange, is the most natural way of sharing information
among autonomous agents.

Agents are said to have information consensus if all know that information. Agents are
said to have common information (knowledge) if all know the information, all know that all
know the information, all know that all know that all know the information, ad infinitum.
To reach coordinated actions, agents need to reach common knowledge
first.

If the communication is perfect and error-free, once a message sender sends a message to
a receiver, the former is sure that the latter gets the message. If the communication is
error-prone, however, no matter how small the chance of error is, one needs an additional protective
mechanism to make sure of the message delivery. Consider any in-band mechanism (using
message exchange as a way of protection) of that kind. In 1978, J. Gray argued that there
exists no such mechanism using a finite number of message exchanges. His note is simple and
worth quoting here verbatim.

The Generals Paradox [20]

“There  are  two  generals  on  campaign.  They  have  an  objective  (a  hill)  which  they  want  to 
capture.  If  they  simultaneously  march  on  the  objective  they  are  assured  of  success.  If  only 
one  marches,  he  will  be  annihilated.  The  generals  are  encamped  only  a  short  distance  apart, 
but  due  to  technical  difficulties,  they  can  communicate  only  via  runners.  These  messengers 
have  a  flaw,  every  time  they  venture  out  of  camp  they  stand  some  chance  of  getting  lost 
(they  are  not  very  smart.)  The  problem  is  to  find  some  protocol  which  allows  the  generals  to 
march  together  even  though  some  messengers  get  lost.  There  is  a  simple  proof  that  no  fixed 
length  protocol  exists:  Let  P  be  the  shortest  such  protocol.  Suppose  the  last  messenger  in 
P  gets  lost.  Then  either  this  messenger  is  useless  or  one  of  the  generals  doesn’t  get  a  needed 
message.  By  the  minimality  of  P,  the  last  message  is  not  useless  so  one  of  the  general  doesn’t 
march  if  the  last  message  is  lost.  This  contradiction  proves  that  no  such  protocol  P  exists.” 

We can construct a discrete knowledge hierarchy as $m$, the number of message
exchanges, increases. We expect that as $m$ goes to infinity, the generals reach common knowledge. Intuition
says that if $m < \infty$ is large, the generals are likely to reach common knowledge. The Generals
Paradox asserts, however, that no matter how large $m$ is, if it is finite, the generals never reach
common knowledge.

The cost of failure to reach common knowledge is high (death) in this problem. Thus,
we can deduce that if we define an agent’s strategy as a function of $m$, or $s(m)$, then

\[ \lim_{m \to \infty} s(m) \ne s(\infty). \]

In other words, the sequence of strategies does not converge to the common-knowledge strategy
as the knowledge hierarchy builds up. (Essentially the same observation was made by A. Rubinstein
in 1989 in the context of incomplete information games and Bayesian Nash equilibrium [36].)
This observation triggered a series of research efforts that attempt to understand the topological
properties of belief spaces. Good pieces of work along this line include [17, 32, 16, 1, 30].
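The knowledge hierarchy discussed above can be made concrete with a toy simulation. The sketch assumes an alternating acknowledgment protocol (A sends the plan to B, B acknowledges, A acknowledges the acknowledgment, and so on) in which each message is lost independently with a fixed probability; the protocol, the function names, and the parameter values are illustrative assumptions and are not taken from Gray's note.

```python
import random

def delivered_prefix(m, loss_prob, rng):
    """Number of consecutive messages delivered before the first loss,
    in a chain of at most m alternating messages."""
    depth = 0
    for _ in range(m):
        if rng.random() < loss_prob:
            break  # the chain stops at the first lost message
        depth += 1
    return depth

def deepest_statement(depth):
    """The deepest 'X knows that ...' statement established after
    `depth` delivered messages (A is the general who made the plan)."""
    statement = "the attack is planned"
    knowers = ("B", "A")
    for level in range(depth):
        statement = f"{knowers[level % 2]} knows that {statement}"
    return statement

rng = random.Random(0)  # fixed seed, only to make the run repeatable
for m in (1, 2, 5, 50):
    depth = delivered_prefix(m, loss_prob=0.1, rng=rng)
    print(m, depth, "->", deepest_statement(min(depth, 3)))
```

However large `m` is, the established depth is at most `m`, so only finitely many levels of the hierarchy are ever reached, whereas common knowledge requires every level.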

In Chapter 3, we study a cooperative game with private information under a robust decision criterion. We not only provide an impossibility result aligned with the Generals Paradox, but also provide a way of coordinating to achieve a guaranteed efficient outcome.
Chapter 2


Non-cooperative  Game  with 
Non-probabilistic  Uncertainty 


This  chapter  studies  one-shot  two-player  games  with  non-Bayesian  uncertainty.  The 
players  have  an  attitude  that  ranges  from  optimism  to  pessimism  in  the  face  of  uncertainty. 
Given  the  attitudes,  each  player  forms  a  belief  about  the  set  of  possible  strategies  of  the  other 
player.  If  these  beliefs  are  consistent,  one  says  that  they  form  an  uncertainty  equilibrium. 
One  then  considers  a  two-phase  game  where  the  players  first  choose  their  attitude  and  then 
play  the  resulting  game.  The  chapter  illustrates  these  notions  with  a  number  of  games  where 
the  approach  provides  a  new  insight  into  the  plausible  strategies  of  the  players. 


2.1  Introduction 

We study a one-shot non-cooperative game of two rational players with non-probabilistic information uncertainty. Specifically, we assume that the set of possible values of the uncertain parameter is known, but that no prior distribution is available. Thus, instead of the more traditional Bayesian approach where players maximize their expected reward, here players have an attitude that models their risk aversion. An optimistic (respectively, pessimistic) player assumes that the other player will choose a strategy that is beneficial (respectively, detrimental) to her. A moderately optimistic player makes an intermediate assumption. However, in contrast with other approaches, we assume that the players choose their attitude by analyzing the consequences of their choice, instead of assuming that their risk aversion is predetermined.

Different players may have different objectives in the face of uncertainty. Some popular choices include minimax regret, maximin pessimism, and maximax optimism. Instead of fixing a player's optimization objective, we allow a rational player to choose somewhere between the worst case and the best case. We parametrize a player's subjective decision criterion as a convex combination of pessimism and optimism with parameter $\pi$, and we call it the player's attitude against uncertainty. As we explained in the Introduction, Hurwicz (1951) [22] proposed a similar convex combination criterion for a single agent decision making problem. However, one crucial aspect of this study is that the attitude is not fixed ahead of time. Instead, the players choose their attitude strategically. For instance, the players may realize that the only rational attitude is to be optimistic because it is the only Nash equilibrium in a two-stage game where the first stage is to choose the attitude. More generally, there may be a set of attitudes for each player from which it is not rational to deviate unilaterally. In such a case, the model provides some information about how to behave rationally in the face of uncertainty.

Section 2.2 develops a model of two non-cooperative players with non-probabilistic parameter uncertainty, and introduces the notions of attitude and uncertainty equilibrium. Section 2.3 gives a condition for the existence of an uncertainty equilibrium and relates it to a Nash equilibrium of the corresponding full information game. Section 2.4 presents examples for which the approach provides new insight into the strategies. Section 2.5 proves that at least one player should not be pessimistic. Section 2.6 concludes the chapter.


2.2  Uncertainty  Equilibrium 

This section defines the model of a game with uncertainty. It then introduces the notion of uncertainty equilibrium for players that have specific attitudes, and finally defines the two-phase game. First, we define a reference game with full information.

Definition 6 (Certainty Game $\mathcal{G}_0$).

Two non-cooperative, selfish and rational players $i = 1, 2$ and $j = 3 - i$ play a game with strategies $x := (x_1, x_2) \in X_{1,0} \times X_{2,0}$, where $X_{i,0} \subset \mathbb{R}$ is $i$'s closed bounded strategy interval. Player $i$ has type $\theta_i \in \mathbb{R}$. The reward of player $i$ is the real-valued $u_i(x, \theta_i)$. This is a full information game with common knowledge about $u_i$, $X_{i,0}$, and $\theta_i$ for all $i$. We assume that this game is such that $u_i(x, \theta_i)$ is continuous in $(x, \theta_i)$, has a unique maximizer $x_i(x_j, \theta_i)$ for every $(x_j, \theta_i)$, and has at least one pure Nash equilibrium.

We  now  consider  the  game  with  uncertainty  about  the  opponent’s  type. 

Definition 7 (Uncertainty Game $\mathcal{G}$).

Player $i$ knows her own true type $\theta_i$ but only that $\theta_j \in \Theta_j$ for $j = 3 - i$, where $\Theta_j$ is a closed bounded real interval, and this is common knowledge. To avoid triviality, $\Theta_j$ is assumed to be of non-zero length unless specified otherwise.

The goal of the chapter is to study the notion of equilibrium in such a situation. Our approach is non-Bayesian. That is, we assume neither a known posterior distribution of the parameters nor the existence of a common prior distribution.

We start with a simple approach to refine the set of rational strategies. Assume that it is known that player $i$ chooses $x_i \in X_i$. It may be reasonable to believe that player $j$ will choose a strategy $x_j(x_i, \theta_j)$ for some $x_i \in X_i$. This best response $x_j$ is defined in the strategy space $X_{j,0}$. Since player $i$ does not know $\theta_j$, she may then believe that player $j$ chooses $x_j \in \phi_j(X_i)$ where

$$\phi_j(X_i) := \{ x_j(x_i, \theta_j) \mid x_i \in X_i, \ \theta_j \in \Theta_j \}. \tag{2.1}$$

These  considerations  lead  to  the  following  definition. 

Definition 8. The sets $X_1^c, X_2^c$ are consistent if $X_j^c = \phi_j(X_i^c)$ for $i = 1, 2$ and $j = 3 - i$.

Since the best response $x_i(x_j, \theta_i)$ is defined in $X_{i,0}$, $\phi_i(X_j)$ for any $X_j$ is also contained in $X_{i,0}$. Thus for consistent sets $X_1^c, X_2^c$,

$$X_i^c \subset X_{i,0}, \quad \text{for all } i.$$

The consistent sets form a product space of strategies beyond which no rational player plays. Although the consistent sets are smaller than the original strategy spaces $X_{i,0}$, they may be large and provide little guidance on the strategies the players should choose. Moreover, one may question whether the players will choose strategies in the consistent sets.


2.2.1  Optimism  and  Pessimism 

We now develop a different formulation of the game that considers the attitudes $\pi = (\pi_1, \pi_2) \in [0, 1]^2$ of players in the face of uncertainty.

Definition 9 (Game with Attitudes $\pi$: $\mathcal{G}(\pi)$).

If it is known that player $j$ chooses $x_j \in X_j$, then player $i$ chooses $x_i \in X_{i,0}$ to maximize

$$f_i(x_i, X_j, \theta_i, 1) := \max_{x_j \in X_j} u_i(x, \theta_i)$$

if she is optimistic and to maximize

$$f_i(x_i, X_j, \theta_i, 0) := \min_{x_j \in X_j} u_i(x, \theta_i)$$

if she is pessimistic. In general, for $0 \le \pi_i \le 1$, if player $i$ has attitude $\pi_i$, she chooses $x_i \in X_{i,0}$ to maximize

$$f_i(x_i, X_j, \theta_i, \pi_i) := \pi_i \max_{x_j \in X_j} u_i(x, \theta_i) + (1 - \pi_i) \min_{x_j \in X_j} u_i(x, \theta_i). \tag{2.2}$$

We primarily study a discrete attitude space $\pi_i \in \{0, 1\}$, and later use the continuous attitude space $\pi_i \in [0, 1]$ in developing the notion of robust attitude.

Designate by $r_i(X_j, \theta_i, \pi_i)$ the set of maximizers of $f_i(x_i, X_j, \theta_i, \pi_i)$. That is,

$$r_i(X_j, \theta_i, \pi_i) := \arg\max_{x_i \in X_{i,0}} f_i(x_i, X_j, \theta_i, \pi_i). \tag{2.3}$$

Since player $j$ does not know $\theta_i$, she assumes that $x_i \in \psi_i(X_j; \pi_i)$ where

$$\psi_i(X_j; \pi_i) := \bigcup_{\theta_i \in \Theta_i} r_i(X_j, \theta_i, \pi_i). \tag{2.4}$$
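The criterion (2.2) and the maximizer set (2.3) can be sketched numerically on a finite grid. The following is only an illustration under stated assumptions: the function names, the grid resolution, the belief set, and the quadratic `toy_reward` are hypothetical choices made for this sketch and are not part of the model definition.

```python
def anticipated_reward(u, x_i, X_j, theta_i, pi_i):
    """Criterion (2.2): pi_i * best case + (1 - pi_i) * worst case over x_j in X_j."""
    rewards = [u(x_i, x_j, theta_i) for x_j in X_j]
    return pi_i * max(rewards) + (1.0 - pi_i) * min(rewards)


def best_responses(u, X_i0, X_j, theta_i, pi_i, tol=1e-12):
    """Grid version of (2.3): all points of X_i0 within tol of the maximum of f_i."""
    values = [anticipated_reward(u, x_i, X_j, theta_i, pi_i) for x_i in X_i0]
    top = max(values)
    return [x for x, v in zip(X_i0, values) if v >= top - tol]


def toy_reward(x_i, x_j, theta_i):
    # Hypothetical reward chosen only for illustration: it decreases in x_j.
    return x_i * (1.0 - x_i - x_j) - theta_i * x_i


grid = [k / 100 for k in range(101)]   # assumed discretization of X_{i,0} = [0, 1]
belief = [0.2, 0.3, 0.4]               # assumed finite belief set X_j

optimistic = best_responses(toy_reward, grid, belief, theta_i=0.1, pi_i=1.0)
pessimistic = best_responses(toy_reward, grid, belief, theta_i=0.1, pi_i=0.0)
print(optimistic, pessimistic)
```

With this toy reward the optimistic player plans against the smallest opponent strategy in the belief set and the pessimistic player against the largest, so the optimistic best response is the larger of the two.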

2.2.2  Uncertainty  Equilibrium 

We  then  have  the  following  definition. 

Definition 10 (Uncertainty Equilibrium of $\mathcal{G}(\pi)$).

The pair of sets $(X_1, X_2)$ is an uncertainty equilibrium for players with attitudes $\pi$ if $X_i = \psi_i(X_j; \pi_i)$ for $i = 1, 2$ and $j = 3 - i$.

Moreover, if the uncertainty equilibrium is unique, we consider that player $i$ plays $x_i \in r_i(X_j, \theta_i, \pi_i)$ to maximize her interim anticipated reward $f_i(x_i, X_j, \theta_i, \pi_i)$. If the corresponding $x_i$ is unique and equal to $x_i(\theta_i, \pi)$, it results in actual (ex-post) rewards $U_i := u_i(x_i(\theta_i, \pi), x_j(\theta_j, \pi), \theta_i)$. If the context is clear, we simplify this as $U_i(\pi) := u_i(x_i(\pi), x_j(\pi), \theta_i)$ where $x_i(\pi) = x_i(\theta_i, \pi)$.


2.2.3  Attitude  Game 

Is  it  preferable  to  be  optimistic  or  pessimistic?  To  answer  this  question,  we  consider  a 
two-stage  game. 
Definition 11 (Attitude Game $\mathcal{A}$).

In the first stage, the players choose their attitudes $(\pi_1, \pi_2) \in \{0, 1\}^2$. In the second stage, they play $\mathcal{G}(\pi)$ and get the rewards $U_i(\pi)$.

If $\pi = (0, 0)$ is the unique Nash equilibrium of the two-stage game, we conclude that the players should be pessimistic. Moreover, the analysis then specifies precisely how they should choose their second stage strategy. The situation is similar if any $\pi \in [0, 1]^2$ is the unique Nash equilibrium attitude. A player $i$'s attitude $\pi_i^*$ is said to be dominant if for any $\pi_i$ and $\theta_j$, $j = 3 - i$,

$$U_i(\pi_i^*, \pi_j) \ge U_i(\pi_i, \pi_j)$$

for all $\pi_j$.

In contrast with traditional approaches, we do not consider that players have a fixed attitude (as a type). Instead, they choose their attitude by analyzing the game rather than being driven by a preordained risk aversion.

As we show in the following sections, there are games where this approach makes it possible to rationalize specific strategies under uncertainty.


2.3 Existence of Uncertainty Equilibrium and its Relation to Nash Equilibrium


This  section  provides  a  condition  for  the  existence  of  an  uncertainty  equilibrium. 


Theorem 2 (Existence of Uncertainty Equilibrium).

Assume $r_i(X_j, \theta_i, \pi_i)$ is single-valued and continuous in $X_j$, $\theta_i$ and $\pi_i$. Then there exists an uncertainty equilibrium $(X_1^*(\pi), X_2^*(\pi))$.

At an uncertainty equilibrium $(X_1^*(\pi), X_2^*(\pi))$, $i$'s best response is

$$x_i^*(\pi) = r_i(X_j^*(\pi), \theta_i, \pi_i).$$

From the proof of Theorem 2, note that there is a one-to-one correspondence between the $x_i^*(\pi)$'s and the $X_i^*(\pi)$'s via the $r_i$'s. In particular, if $\Theta_i$ is a singleton, then $X_i^*(\pi) = x_i^*(\pi)$. This (obvious) observation is stated in the next theorem.

Theorem 3. Under the assumptions of Theorem 2, $\mathcal{G}(\pi)$'s uncertainty equilibrium $(X_1^*(\pi), X_2^*(\pi))$ coincides with game $\mathcal{G}_0$'s Nash equilibrium $(x_1^*, x_2^*)$ if $\Theta_i = \{\theta_i\}$ for $i = 1, 2$, irrespective of $\pi$.


2.4  Examples 

The  first  example  is  a  game  with  negative  externality.  In  this  game,  the  players  should 
be  optimistic  even  when  they  are  uncertain  about  the  opponent’s  type.  The  second  example 
is  a  game  with  positive  externality.  In  this  game  the  players  are  better  off  when  they  both 
are  pessimistic  than  when  they  both  are  optimistic.  However,  we  will  see  that  players  are 
inconclusive  in  the  choice  of  attitudes  because  there  are  two  pure  Nash  attitudes.  The 
third  example  is  a  Cournot  duopoly  game  [15]  with  uncertainty.  For  this  game,  we  study 
conditions  for  the  existence  of  dominant  attitudes,  and  robust  attitudes.  For  clarity,  the 
algebraic  derivations  are  in  the  appendix. 


2.4.1  A  Game  with  Negative  Externality 

Consider two agents $i = 1, 2$ who consume resources $x_i \in [0, 1]$ to gain some benefit. The consumption degrades the quality of the environment, which affects both players. The agent's reward is defined to be the benefit minus the degradation of the environment. The benefit is assumed to be proportional to the consumption. The environment degrades exponentially in the sum of the players' consumption ($\exp\{x_1 + x_2\}$), via the scaling factor $\exp\{-\theta_i\}$, where $\theta_i^{-1}$ captures $i$'s susceptibility to the environmental degradation. Here, $\theta_i$ is private information for agent $i$, with $x_i \in [0, 1]$ and $\theta_i \in [\alpha, \beta]$ for some $0 < \alpha < 2\alpha < \beta < 1$. Agent $i$'s reward is

$$u_i(x, \theta_i) = x_i - \exp\{-\theta_i + x_i + x_j\}.$$

(One  may  add  a  constant  to  make  the  rewards  positive.) 

Theorem 4. Agents should be optimistic and choose the consumption levels $x_i = \theta_i - \alpha/2$ for $i = 1, 2$. In contrast, if $\theta_1, \theta_2$ are fully known and $\theta_1 < \theta_2$, then the only Nash strategy is $(x_1, x_2) = (0, \theta_2)$.

For this game, the only consistent sets (see Definition 8) are $X_1^c = X_2^c = [0, \beta]$, which provide little information about the strategies of the agents.
2.4.2 A Game with Positive Externality

Consider two agents $i = 1, 2$ and $j = 3 - i$ who spend the effort $x_i \ge 0$ to gain some benefit. There is a positive externality in benefit: the opponent agent's effort spills over and positively affects the agent's benefit. The sensitivity of agent $i$ to the spill-over is her private information $\theta_i$. Agent $i$'s utility is

$$u_i = \sqrt{x_i + \theta_i x_j} - x_i.$$

Let, for all $i$, $\Theta_i = [1/4, 1/2]$.

Certainty  game 

It  is  easy  to  see  this  game  has  a  free-riding  effect.  Both  agents  free  ride  on  each  other, 
to  some  degree.  A  social  planner  would  make  agents  invest  more  than  they  do  at  the  Nash 
equilibrium.  First,  we  study  a  Nash  equilibrium  for  the  full  information  game.  From  the 
first  order  condition,  we  find 


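$$\frac{\partial u_i}{\partial x_i} = \frac{1}{2} \frac{1}{\sqrt{x_i + \theta_i x_j}} - 1 = 0.$$

Thus $i$'s best response to $x_j$ is

$$x_i = \left[ \frac{1}{4} - \theta_i x_j \right]^+.$$

This game has only one pure Nash equilibrium

$$x_i = \frac{1}{4} \, \frac{1 - \theta_i}{1 - \theta_i \theta_j}$$

with corresponding utility

$$u_i = \frac{1}{2} - \frac{1 - \theta_i}{4 (1 - \theta_i \theta_j)}.$$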


Uncertainty  game:  Optimism  case 

Assume that both agents are optimistic and that both know this: they choose the attitudes $\pi = (O, O)$. Starting with $X_i = [c_i, d_i]$ at equilibrium, since $u_i$ is increasing in $x_j$, we see that an optimistic agent assumes that the other agent selects her largest strategy. As a result,

$$x_i = \left[ \frac{1}{4} - \theta_i d_j \right]^+.$$

Thus

$$X_i = \left[ \frac{1}{4} - \frac{1}{2} d_j, \ \frac{1}{4} - \frac{1}{4} d_j \right] =: [c_i, d_i].$$

The unique equilibrium is $X_i = [3/20, 1/5]$. Accordingly, $i$'s strategy becomes

$$x_i(O, O) = \frac{1}{4} - \frac{1}{5} \theta_i.$$


Uncertainty  game:  Pessimism  case 

Assume now that both agents are pessimistic and that both know this: they choose the attitudes $\pi = (P, P)$. Arguing as before, a pessimistic agent assumes that the other agent selects her minimum strategy. Hence, starting with $X_i = [c_i, d_i]$, we find

$$x_i = \left[ \frac{1}{4} - \theta_i c_j \right]^+.$$

Thus

$$X_i = \left[ \frac{1}{4} - \frac{1}{2} c_j, \ \frac{1}{4} - \frac{1}{4} c_j \right] =: [c_i, d_i].$$

The unique equilibrium is $X_i = [1/6, 5/24]$. Accordingly, $i$'s strategy becomes

$$x_i(P, P) = \frac{1}{4} \left( 1 - \frac{2}{3} \theta_i \right).$$

We  can  see  that  a  pessimistic  agent  invests  more  than  an  optimistic  one. 
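The two equilibrium intervals can be checked by iterating the interval map directly. The sketch below assumes the symmetric setting $\Theta_1 = \Theta_2 = [1/4, 1/2]$ of this example, so both agents share the same interval; the function names, the starting interval, and the iteration count are illustrative choices.

```python
def equilibrium_interval(optimistic, theta_lo=0.25, theta_hi=0.5, iters=200):
    """Iterate X_i = [1/4 - theta_hi * y, 1/4 - theta_lo * y], where y is the
    opponent's largest strategy (optimism) or smallest strategy (pessimism)."""
    c, d = 0.0, 0.25  # assumed starting belief interval
    for _ in range(iters):
        y = d if optimistic else c
        c, d = max(0.25 - theta_hi * y, 0.0), max(0.25 - theta_lo * y, 0.0)
    return c, d


def strategy(theta_i, optimistic):
    """Best response [1/4 - theta_i * y]^+ against the equilibrium interval."""
    c, d = equilibrium_interval(optimistic)
    y = d if optimistic else c
    return max(0.25 - theta_i * y, 0.0)


print(equilibrium_interval(optimistic=True))   # numerically close to (3/20, 1/5)
print(equilibrium_interval(optimistic=False))  # numerically close to (1/6, 5/24)
```

Both maps are contractions, so the iteration settles quickly, and for any type the pessimistic strategy exceeds the optimistic one.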


A  right  attitude 

We can continue a similar analysis for $\pi = (O, P)$ and $\pi = (P, O)$. For any $\theta_i$, one can easily show that the following inequalities hold:

$$U_{1,OP} > U_{1,PP} > U_{1,PO} > U_{1,OO} \quad \text{for all } \theta_2 \in \Theta_2,$$

and

$$U_{2,PO} > U_{2,PP} > U_{2,OP} > U_{2,OO} \quad \text{for all } \theta_1 \in \Theta_1.$$

Thus, in this positive externality game, the ex-post utilities when both agents are pessimistic are larger than when both of them are optimistic. The attitude game has two pure Nash equilibria, $(O, P)$ and $(P, O)$: in each, neither agent gains by unilaterally switching her attitude. Thus the analysis does not single out one attitude for the players.
2.4.3 Cournot Duopoly Game


Full  Information  Case 

For $i = 1, 2$, selfish and rational player $i$ produces a non-negative quantity $x_i$ of homogeneous items with a non-negative production cost $\theta_i \in [0, 1/2]$ per item. The selling price per item is $(1 - x_1 - x_2)^+$ where $y^+ = \max\{y, 0\}$ for $y \in \mathbb{R}$. Accordingly, the reward (profit) of player $i$ is $u_i(x, \theta_i)$ defined as follows:

$$u_i(x, \theta_i) := x_i (1 - x_1 - x_2)^+ - \theta_i x_i \tag{2.5}$$

where $x = (x_1, x_2)$.

Player $i$'s strategy is the quantity $x_i$ to produce. The value of $x_i$ that maximizes $u_i(x, \theta_i)$ is $x_i = (1 - \theta_i - x_j)/2$, for $i = 1, 2$ and $j = 3 - i$. The unique solution of these equations is the Nash equilibrium $x^* := (x_1^*, x_2^*)$ where

$$x_i^* = (1 - 2\theta_i + \theta_j)/3. \tag{2.6}$$

The corresponding utilities are

$$u_i^* = (x_i^*)^2. \tag{2.7}$$

Note that the pair $x = (x_1, x_2)$ that maximizes $u_{social} := \sum_{i=1,2} u_i(x, \theta_i)$ is $((1 - \theta_1)/2, 0)$ when $\theta_1 < \theta_2$. This "social optimum" is quite different from the Nash equilibrium. There,

$$u_{social} = (1 - \theta_1)^2 / 4. \tag{2.8}$$
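As a numerical sanity check of the full information case, the best-response equations can be iterated and compared with the closed form for the Nash equilibrium. This is a minimal sketch; the function names, the starting point, and the cost values used in the example are illustrative assumptions.

```python
def best_response(theta_i, x_j):
    """Maximizer of x_i * (1 - x_i - x_j) - theta_i * x_i, clipped at zero."""
    return max((1.0 - theta_i - x_j) / 2.0, 0.0)


def nash_by_iteration(theta_1, theta_2, iters=200):
    """Simultaneous best-response iteration starting from zero production."""
    x1 = x2 = 0.0
    for _ in range(iters):
        x1, x2 = best_response(theta_1, x2), best_response(theta_2, x1)
    return x1, x2


def nash_closed_form(theta_1, theta_2):
    """Closed-form Nash quantities (1 - 2*theta_i + theta_j) / 3."""
    return (1 - 2 * theta_1 + theta_2) / 3, (1 - 2 * theta_2 + theta_1) / 3


def profit(x_i, x_j, theta_i):
    """Cournot reward with the price truncated at zero."""
    return x_i * max(1.0 - x_i - x_j, 0.0) - theta_i * x_i


x1, x2 = nash_by_iteration(0.1, 0.3)
print(x1, x2, profit(x1, x2, 0.1), profit(x2, x1, 0.3))
```

At the Nash point each profit equals the square of the player's own quantity, and the total stays below the social optimum value $(1 - \theta_1)^2/4$.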


Bayesian  Uncertainty  Case 

In a Bayesian model, one assumes that $\theta_1$ and $\theta_2$ are independent with known distributions; each player $i = 1, 2$ knows $\theta_i$ and only the distribution of $\theta_j$ for $j = 3 - i$, and this is common knowledge. In that case, for player 1,

$$E[u_1(x, \theta_1) \mid x_1, \theta_1] = x_1 (1 - x_1 - E[x_2 \mid x_1, \theta_1]) - \theta_1 x_1$$

and this expression is maximized by

$$x_1 = (1 - E[x_2 \mid x_1, \theta_1] - \theta_1)/2 = (1 - E(x_2) - \theta_1)/2.$$

The last expression follows from the observation that $x_2$ is only a function of $\theta_2$, which is independent of $\theta_1$. Consequently, for $i = 1, 2$,

$$E(x_i) = (1 - E(x_j) - \mu_i)/2 \quad \text{where } \mu_i := E(\theta_i).$$

Solving this system of two equations, we find

$$E(x_1) = (1 - 2\mu_1 + \mu_2)/3 \quad \text{and} \quad E(x_2) = (1 - 2\mu_2 + \mu_1)/3.$$

Accordingly, for $i = 1, 2$,

$$x_{i,B} = (2 - 3\theta_i - \mu_i + 2\mu_j)/6 \quad \text{where } j = 3 - i. \tag{2.9}$$

This solution is the unique Bayesian Nash equilibrium. Note that player $i$'s strategy maximizes her interim expected utility $E[u_i(x_B, \theta_i) \mid x_{i,B}, \theta_i]$, rather than the ex-post utility $u_i(x_B, \theta_i)$, which $i$ cannot compute.
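The Bayesian strategy can be verified to be a fixed point of the expected best response using exact rational arithmetic. The sketch below is illustrative: the function names and the sample values of the type and the means are assumptions made for the check.

```python
from fractions import Fraction as F


def bayes_strategy(theta_i, mu_i, mu_j):
    """Bayesian strategy (2 - 3*theta_i - mu_i + 2*mu_j) / 6 for player i."""
    return (2 - 3 * theta_i - mu_i + 2 * mu_j) / 6


def expected_strategy(mu_i, mu_j):
    """E(x_i): the strategy is linear in theta_i, so its mean is its value at mu_i."""
    return bayes_strategy(mu_i, mu_i, mu_j)


# Hypothetical sample values: player i's type and the two means.
theta, mu_i, mu_j = F(1, 5), F(1, 4), F(1, 3)

# Best response of player i against the opponent's expected strategy E(x_j).
best_reply = (1 - expected_strategy(mu_j, mu_i) - theta) / 2
print(bayes_strategy(theta, mu_i, mu_j), best_reply)
```

Because the best response is linear in the opponent's quantity, replying to the mean $E(x_j)$ is enough, and the two printed values coincide.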


Consistent  Set 

Recall the definition of a consistent set. It provides a strategy bound beyond which a rational player should not play when non-probabilistic uncertainties prevail.

Consider player $i$'s best response to $x_j$ at $\theta_i$, $x_i(x_j, \theta_i)$, which is real and continuous in $x_j$ and $\theta_i$. Then $\phi_j(X_i)$ is a compact interval for any compact interval $X_i$. Suppose $X_1^c = [a, b]$ and $X_2^c = [c, d]$. Using the definition $\Theta_i = [\alpha_i, \beta_i]$, we find the following.

Theorem 5. The consistent set for the Cournot duopoly game is unique and is given as

$$X_1^c = [(1 - 2\beta_1 + \alpha_2)/3, \ (1 - 2\alpha_1 + \beta_2)/3] \quad \text{and} \quad X_2^c = [(1 - 2\beta_2 + \alpha_1)/3, \ (1 - 2\alpha_2 + \beta_1)/3].$$


Proof. Recall $x_i(x_j, \theta_i) = (1 - x_j - \theta_i)/2$. Thus

$$\phi_1(X_2^c) = [(1 - d - \beta_1)/2, \ (1 - c - \alpha_1)/2] =: [a, b].$$

Similarly

$$\phi_2(X_1^c) = [(1 - b - \beta_2)/2, \ (1 - a - \alpha_2)/2] =: [c, d].$$

Solving the two equations above yields the result. □
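The fixed point in the proof can also be reached numerically by applying the two interval maps repeatedly. The following sketch assumes the interior case in which the best response $(1 - x_j - \theta_i)/2$ is never clipped at zero; function names, the starting intervals, and the sample type ranges are illustrative.

```python
def phi(X_other, alpha, beta):
    """Image of the interval X_other = (lo, hi) under the best response
    (1 - x_other - theta) / 2 as theta ranges over [alpha, beta]."""
    lo, hi = X_other
    return ((1 - hi - beta) / 2, (1 - lo - alpha) / 2)


def consistent_sets(a1, b1, a2, b2, iters=200):
    """Iterate X_1 = phi_1(X_2), X_2 = phi_2(X_1) from an assumed starting guess."""
    X1, X2 = (0.0, 1.0), (0.0, 1.0)
    for _ in range(iters):
        X1, X2 = phi(X2, a1, b1), phi(X1, a2, b2)
    return X1, X2


def theorem5(a1, b1, a2, b2):
    """Closed-form intervals stated in Theorem 5."""
    return (((1 - 2 * b1 + a2) / 3, (1 - 2 * a1 + b2) / 3),
            ((1 - 2 * b2 + a1) / 3, (1 - 2 * a2 + b1) / 3))


print(consistent_sets(0.1, 0.3, 0.0, 0.4))
print(theorem5(0.1, 0.3, 0.0, 0.4))
```

The iteration is a linear contraction, so the numerical intervals agree with the closed form up to rounding.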


Game  with  Attitudes 

One assumes that, for $i = 1, 2$, player $i$ knows $\theta_i$ but only that $\theta_j \in \Theta_j := [\alpha_j, \beta_j]$ for $j = 3 - i$ where $\beta_j \le 1/2$. This is common knowledge. Moreover, player $i$ has attitude $\pi_i \in [0, 1]$. The following result is shown in the appendix.

Theorem 6. The unique uncertainty equilibrium with attitudes $\pi$ is the pair of intervals $B[s_i, t_i] := [s_i - t_i/2, \ s_i + t_i/2]$ for $i = 1, 2$, where

$$s_i = \frac{1}{3} \Delta_j \pi_i - \frac{1}{6} \Delta_i \pi_j + \frac{1}{12} (4 - 3\beta_i - 5\alpha_i + 4\alpha_j) \tag{2.10}$$

and $\Delta_i := \beta_i - \alpha_i$ and $t_i = (\beta_i - \alpha_i)/2$. The strategies that maximize the interim anticipated rewards are

$$x_i^{\pi} = \frac{1}{3} \Delta_j \pi_i - \frac{1}{6} \Delta_i \pi_j + A_i, \tag{2.11}$$

where $A_i = (2 - \alpha_i + 2\alpha_j - 3\theta_i)/6$.
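The closed form of Theorem 6 can be compared against a direct fixed-point iteration of the attitude best-response map. This sketch relies on two assumptions that hold for the Cournot reward when the price stays positive: the reward decreases in the opponent's quantity, so the best case is the lower end of the believed interval and the worst case is the upper end, and the anticipated reward (2.2) is then maximized by the best response to the attitude-weighted point. All function names and sample parameters are illustrative.

```python
def closed_form(own, other, pi_own, pi_other):
    """Center s_i and width t_i of the equilibrium interval; own/other = (alpha, beta)."""
    (ai, bi), (aj, bj) = own, other
    s = (bj - aj) * pi_own / 3 - (bi - ai) * pi_other / 6 \
        + (4 - 3 * bi - 5 * ai + 4 * aj) / 12
    return s, (bi - ai) / 2


def strategy(theta_i, own, other, pi_own, pi_other):
    """Strategy maximizing the interim anticipated reward for type theta_i."""
    (ai, bi), (aj, bj) = own, other
    A = (2 - ai + 2 * aj - 3 * theta_i) / 6
    return (bj - aj) * pi_own / 3 - (bi - ai) * pi_other / 6 + A


def iterate_equilibrium(p1, p2, pi1, pi2, iters=200):
    """Iterate X_i = psi_i(X_j; pi_i) for both players from an assumed start."""
    X = [(0.0, 0.5), (0.0, 0.5)]
    params, pis = [p1, p2], [pi1, pi2]
    for _ in range(iters):
        new = []
        for i in (0, 1):
            lo_j, hi_j = X[1 - i]
            # Attitude-weighted opponent quantity: best case lo_j, worst case hi_j.
            y = pis[i] * lo_j + (1 - pis[i]) * hi_j
            alpha, beta = params[i]
            new.append(((1 - beta - y) / 2, (1 - alpha - y) / 2))
        X = new
    return X


p1, p2 = (0.1, 0.3), (0.0, 0.4)  # hypothetical type ranges
print(iterate_equilibrium(p1, p2, 0.25, 0.75))
print(closed_form(p1, p2, 0.25, 0.75), closed_form(p2, p1, 0.75, 0.25))
```

For these sample values the iteration and the closed form give the same intervals, and the strategy of any type lies inside the player's own interval.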

In fact, the attitudes are an added degree of freedom that enables rationalization in choosing a strategy out of the consistent set. That is, the recommended strategy sweeps the entire consistent set as $\pi$ varies. That result is expressed in the following theorem.

Theorem 7. The consistent set $(X_1^c, X_2^c)$ and the uncertainty equilibrium $(X_1(\pi), X_2(\pi))$ at $\pi$ satisfy the following relation:

$$X_i^c = \bigcup_{\pi \in [0,1]^2} X_i(\pi) \quad \text{for } i = 1, 2. \tag{2.12}$$


Therefore, the attitude structure is exhaustively descriptive in expressing any feasible rational strategy in the consistent sets. Note that the consistent sets do not require any knowledge about the uncertainties beyond the ranges within which they lie. Thus the same consistent sets are obtained when one considers the set of rational strategies over a family of probability distributions of the uncertainties whose supports are the same ranges. As a result, one can expect that there always exists a pair of attitudes that corresponds to a particular choice of probability distributions of the uncertainties. Indeed, when a Bayesian Cournot game assumes distributions of $(\theta_1, \theta_2)$ with mean $(\mu_1, \mu_2)$, the corresponding attitudes are

$$\pi_i = \frac{\mu_j - \alpha_j}{\beta_j - \alpha_j}, \quad \text{for all } i.$$

This result is obtained by equating (2.9) and (2.11) for $i = 1, 2$ and solving these equations.
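The correspondence between means and attitudes can be checked with exact rational arithmetic. In this sketch the Bayesian strategy is $(2 - 3\theta_i - \mu_i + 2\mu_j)/6$ and the attitude strategy is $\frac{1}{3}\Delta_j\pi_i - \frac{1}{6}\Delta_i\pi_j + (2 - \alpha_i + 2\alpha_j - 3\theta_i)/6$; the function names and the sample ranges and means are assumptions made for the check.

```python
from fractions import Fraction as F


def bayes_strategy(theta_i, mu_i, mu_j):
    """Bayesian Nash strategy of player i."""
    return (2 - 3 * theta_i - mu_i + 2 * mu_j) / 6


def attitude_strategy(theta_i, ai, bi, aj, bj, pi_i, pi_j):
    """Strategy of player i in the game with attitudes (pi_i, pi_j)."""
    A = (2 - ai + 2 * aj - 3 * theta_i) / 6
    return (bj - aj) * pi_i / 3 - (bi - ai) * pi_j / 6 + A


def matching_attitude(mu_other, a_other, b_other):
    """Attitude of a player expressed through the mean of the opponent's type."""
    return (mu_other - a_other) / (b_other - a_other)


# Hypothetical type ranges and means (each mean lies inside its range).
ai, bi, aj, bj = F(1, 10), F(3, 10), F(0), F(2, 5)
mu_i, mu_j = F(1, 5), F(1, 10)
pi_i = matching_attitude(mu_j, aj, bj)
pi_j = matching_attitude(mu_i, ai, bi)
print(pi_i, pi_j)
```

With these attitudes the two strategies agree for every type $\theta_i$, since both are linear in $\theta_i$ with the same slope.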
Rationalization


The strategies (2.11) are rational when $\pi$ is known. In an attitude game $\mathcal{A}$, a player first decides her attitude and then chooses the strategy (2.11). An immediate question is what a rational attitude is. First of all, there are situations where a player can choose a dominant attitude based on her private information and common knowledge, but not on the opponent's private information. The following lemma is proved in the appendix.

Lemma 1 (Dominant attitude).

Let $\underline{\theta}_i := \frac{1}{3}(2 - \beta_i + 4\alpha_j - 2\beta_j)$ and $\bar{\theta}_i := \frac{1}{3}(2 - \alpha_i + 4\beta_j - 2\alpha_j)$. Assume that the attitude space is discrete, $\Pi = \{0, 1\}$. Then the following properties hold:

1. If $\theta_i < \underline{\theta}_i$, then optimism is a dominant strategy for player $i$;

2. If $\theta_i > \bar{\theta}_i$, then pessimism is a dominant strategy for player $i$;

3. If $\underline{\theta}_i < \theta_i < \bar{\theta}_i$, then there is no dominant strategy for player $i$.

In particular, if $\beta_i < 1/3$ for $i = 1, 2$ (i.e., if the unit production costs are sufficiently low), both players should be optimistic.
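The classification in Lemma 1 amounts to comparing the player's own cost with two thresholds. A minimal sketch follows, with the thresholds taken as $\underline{\theta}_i = (2 - \beta_i + 4\alpha_j - 2\beta_j)/3$ and $\bar{\theta}_i = (2 - \alpha_i + 4\beta_j - 2\alpha_j)/3$; the function name, the return labels, and the handling of the boundary cases (left unclassified here) are assumptions of this sketch.

```python
from fractions import Fraction as F


def dominant_attitude(theta_i, alpha_i, beta_i, alpha_j, beta_j):
    """Return 'optimism', 'pessimism', or None when no attitude is dominant
    (boundary cases theta_i == threshold are also reported as None here)."""
    lower = (2 - beta_i + 4 * alpha_j - 2 * beta_j) / 3
    upper = (2 - alpha_i + 4 * beta_j - 2 * alpha_j) / 3
    if theta_i < lower:
        return "optimism"
    if theta_i > upper:
        return "pessimism"
    return None


# Low-cost case: both ranges are [0, 3/10], so beta_i < 1/3.
print(dominant_attitude(F(3, 10), F(0), F(3, 10), F(0), F(3, 10)))
# Wide ranges [0, 1/2] with the highest cost: no dominant attitude.
print(dominant_attitude(F(1, 2), F(0), F(1, 2), F(0), F(1, 2)))
```

The first call illustrates the remark after the lemma: when all costs stay below $1/3$, even the highest type finds optimism dominant.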

The game is said to be symmetric if $u_1 = u_2$ and $\Theta_1 = \Theta_2$.

There is a connection between a symmetric attitude game and a Prisoner's Dilemma game when the attitude space is discrete with $\Pi = \{0, 1\}$.

Theorem 8. Consider the symmetric game $\mathcal{A}$ with $\Theta_1 = \Theta_2 = [\alpha, \beta]$ where $\beta > \alpha$. Then the following properties hold:

1. $(PP)$ is never a Nash equilibrium;

2. $(PP)$ is Pareto efficient;

3. $(PP)$ is Pareto superior to $(OO)$;

4. $O$ is the dominant strategy if $\beta < \max(1/3, 2\alpha)$, so that $(OO)$ is the only Nash equilibrium.

Together with 1), 2), and 3), the condition in the last part makes the attitude game a Prisoner's Dilemma. The last condition requires that the costs are not too large.
Robust attitude


As we observed from the previous example, game $\mathcal{A}$ may not have a dominant attitude for player $i$. In such a case, player $i$ may prefer a strategy that guarantees the largest minimum ex-post reward. That is, player $i$ might seek the robust attitude $\pi_i^r \in [0, 1]$ defined by

$$\pi_i^r := \arg\max_{\pi_i} \min_{\pi_j} u_i(x_i(\theta_i, \pi), x_j(\theta_j, \pi), \theta_i).$$

Theorem 9. The robust attitude of the Cournot duopoly does not coincide with pessimism and is given by

$$\pi_i^r = \min\left(1, \ \frac{2 - 3\theta_i - \beta_i + 2\alpha_j}{4\Delta_j}\right)$$

for $\Delta_j > 0$. Consequently, $\pi_i^r > 0$, except for the singular case $\alpha_j = 0$ and $\theta_i = \beta_i = 1/2$.

Example 1. Let $\beta := \max(\beta_i, \beta_j)$. Then if $\beta < 1/4$, $\pi_1^r = \pi_2^r = 1$. That is, when costs are sufficiently small, the robust strategy is optimism. To see this, note that $\pi_i^r = \min(1, (2 - 3\theta_i - \beta_i + 2\alpha_j)/(4(\beta_j - \alpha_j))) \ge \min(1, (2 - 4\beta)/(4\beta)) = 1$.
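The robust attitude of Theorem 9, $\min(1, (2 - 3\theta_i - \beta_i + 2\alpha_j)/(4\Delta_j))$, is easy to evaluate directly. The sketch below uses exact fractions; the function name and the sample parameters are illustrative assumptions.

```python
from fractions import Fraction as F


def robust_attitude(theta_i, beta_i, alpha_j, beta_j):
    """Robust attitude min(1, (2 - 3*theta_i - beta_i + 2*alpha_j) / (4*Delta_j))."""
    delta_j = beta_j - alpha_j
    if delta_j <= 0:
        raise ValueError("the formula requires Delta_j = beta_j - alpha_j > 0")
    return min(F(1), (2 - 3 * theta_i - beta_i + 2 * alpha_j) / (4 * delta_j))


# Small costs (all below 1/4): the robust attitude is full optimism.
print(robust_attitude(F(1, 5), F(1, 5), F(0), F(1, 5)))
# Singular case alpha_j = 0 and theta_i = beta_i = 1/2: the robust attitude is 0.
print(robust_attitude(F(1, 2), F(1, 2), F(0), F(1, 2)))
# An intermediate case with a strictly interior robust attitude.
print(robust_attitude(F(2, 5), F(1, 2), F(0), F(1, 2)))
```

The three calls cover the small-cost regime of Example 1, the singular case excluded in Theorem 9, and an interior value.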


Rationalization in a general attitude space $\Pi = [0, 1]$

In real economic situations demanding decisions, a human player does not necessarily take an extreme attitude, that is, complete optimism or complete pessimism. It is more natural to think that one takes some combination of optimism and pessimism. That is, one can be $\pi_i$-optimistic, for some $\pi_i \in [0, 1]$.

Define the bounds of a rational attitude for player $i$ by

$$\underline{\pi}_i \le \pi_i \le \bar{\pi}_i, \tag{2.13}$$

implying that it is not rational for player $i$ to choose an attitude outside of this interval. In other words, for any possible true value of the uncertainty that $i$ has and any possible rational attitude of player $j$, player $i$'s best response attitude is neither $\pi_i < \underline{\pi}_i$ nor $\pi_i > \bar{\pi}_i$. If any attitude in $[0, 1]$ is rational, then $\underline{\pi}_i = 0$ and $\bar{\pi}_i = 1$. If a particular attitude is dominant, then $\underline{\pi}_i = \bar{\pi}_i$.

Recall player $i$'s ex-post utility of the attitude game when the two players choose $\pi = (\pi_1, \pi_2)$:

$$U_i(\pi) = u_i(x_i^*(\pi), x_j^*(\pi), \theta_i), \quad \text{for } j = 3 - i.$$
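Using

$$\frac{\partial x_i^*(\pi)}{\partial \pi_i} = \frac{1}{3} \Delta_j \quad \text{and} \quad \frac{\partial x_j^*(\pi)}{\partial \pi_i} = -\frac{1}{6} \Delta_j,$$

we find

$$\frac{36}{\Delta_j} \frac{\partial U_i(\pi)}{\partial \pi_i} = 2 - 3\theta_i - 4\Delta_j \pi_i - \Delta_i \pi_j - \alpha_i - 4\alpha_j + 6\theta_j.$$

In order to achieve the largest ex-post utility, player $i$ should select the largest attitude as long as $\partial U_i(\pi)/\partial \pi_i > 0$. Also she should select the smallest attitude as long as $\partial U_i(\pi)/\partial \pi_i < 0$. To find $\underline{\pi}_i$ and $\bar{\pi}_i$, player $i$ needs to consider all possible values of $\theta_j$ and $\pi_j$.

Assume $\theta_j = \beta_j$. Then,

$$\frac{36}{\Delta_j} \frac{\partial U_i(\pi)}{\partial \pi_i} = (2 - 3\theta_i - \beta_i \pi_j) + 4\Delta_j (1 - \pi_i) + (2\beta_j + \alpha_i \pi_j - \alpha_i) \ge 0.$$

Thus, $\pi_i = 1$ can always be a rational attitude. Therefore $\bar{\pi}_i = 1$ for all $i$.

Also $\partial U_i(\pi)/\partial \pi_i$ is minimized at $\pi_j = 1$ and $\theta_j = \alpha_j$. Considering that $\pi_i$ is defined in $[0, 1]$, the ex-post utility is maximized by

$$\underline{\pi}_i = \min\left( \frac{2 - 3\theta_i - \beta_i + 2\alpha_j}{4\Delta_j}, \ 1 \right),$$

which is the smallest possible rational attitude. These observations lead to the following theorem.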


Theorem 10. Player $i$ should not select an attitude beyond the interval

$$\left[ \min\left( \frac{2 - 3\theta_i - \beta_i + 2\alpha_j}{4\Delta_j}, \ 1 \right), \ 1 \right].$$

Finally, there is a similar dominant attitude criterion for a continuous attitude space. The proof is immediate from the above theorem, and thus omitted.

Theorem 11 (Dominant attitude).

Let $\underline{\theta}_i := \frac{1}{3}(2 - \beta_i + 6\alpha_j - 4\beta_j)$ and $\bar{\theta}_i := \frac{1}{3}(2 - \beta_i + 2\alpha_j)$. Assume that the attitude space is continuous, $\Pi = [0, 1]$. Then the following properties hold:

1. If $\theta_i < \underline{\theta}_i$, then $\pi_i = 1$ is a dominant strategy for player $i$;

2. If $\theta_i > \bar{\theta}_i$, then $\pi_i = 0$ is a dominant strategy for player $i$;

3. If $\underline{\theta}_i < \theta_i < \bar{\theta}_i$, then there is no dominant strategy for player $i$.
Price of Uncertainty


This subsection studies the effect of uncertainty on the social welfare. The social welfare is defined as the sum of the ex-post utilities of the players. A basic question is how harmful uncertainty is to social welfare. To measure this, we define the Price of Uncertainty (PoU) as the ratio of the sum of utilities in an uncertainty game to the sum of utilities in the full information game at its Nash equilibrium, assuming that the latter is unique.


Definition 12 (Price of Uncertainty).

The price of uncertainty of the Cournot duopoly game with attitudes $\pi$ is defined as follows:

$$\text{PoU} := \frac{\sum_{i=1,2} U_i(\pi)}{\sum_{i=1,2} u_i^*} \tag{2.14}$$

where $u_i^*$ is defined in (2.7).

A different notion, the Price of Anarchy, captures how badly the lack of coordination in a game affects the social welfare in comparison to the case where a single designer optimizes the society. The motivation for the Price of Uncertainty is different: one cannot avoid a game situation, so that the optimum social welfare cannot be achieved.

All the numerical bounds of the Price of Uncertainty can be found by simple algebra or numerical analysis, and thus the derivations are omitted for simplicity.

As a reference, we start with the PoU of the social optimum (SO) case. (This is just the reciprocal of the Price of Anarchy.)

$$\text{PoU(SO)} := \frac{\sum_{i=1,2} u_{i,SO}}{\sum_{i=1,2} u_i^*}, \tag{2.15}$$

where the social optimum utility is found in (2.8). Together with (2.7),

$$1 \le \text{PoU(SO)} \le \frac{5}{4}. \tag{2.16}$$

The lower bound is obtained when $(\theta_1, \theta_2) = (0, 1/2)$, and the upper bound is obtained when $\theta_2 = \frac{1}{5}(1 + 4\theta_1)$ for any $0 \le \theta_1 \le \frac{3}{8}$.

For the second reference, we consider the Price of Uncertainty of the Bayesian game. Consider a family of distributions over $\theta_1, \theta_2$ with supports $\Theta_1, \Theta_2$ respectively. Designate $\mu = (\mu_1, \mu_2)$ as the means of the uncertainties. Then define


$$\text{PoU(Bayes)} := \frac{\sum_{i=1,2} u_{i,Bayes}}{\sum_{i=1,2} u_i^*} \tag{2.17}$$

where $u_{i,Bayes} := u_i(x_B, \theta_i)$ and $x_B$ is given by (2.9). Then

$$\frac{1}{2} \le \text{PoU(Bayes)} \le \frac{5}{4} \tag{2.18}$$

and the lower bound is obtained at $(\theta_1, \theta_2) = (0, 1/2)$ and $(\mu_1, \mu_2) = (1/2, 0)$, and the upper bound is obtained at $(\theta_1, \theta_2) = (3/8, 1/2)$ and $(\mu_1, \mu_2) = (0, 1/2)$.

Although these bounds are feasible, it does not make sense for a player to willingly change the distributions of uncertainties.

Finally, define the Price of Uncertainty of an attitude game

$$\text{PoU(Attitude)} := \frac{\sum_{i=1,2} U_i(\pi)}{\sum_{i=1,2} u_i^*}. \tag{2.19}$$

Even for the same $(\theta_1, \theta_2)$, players can willingly change their attitudes as long as their private information does not yield a particular dominant strategy. It is found that

$$\frac{39}{64} \le \text{PoU(Attitude)} \le \frac{5}{4}. \tag{2.20}$$

The lower bound is obtained when $(\theta_1, \theta_2) = (0, 1/2)$ and $(\pi_1, \pi_2) \to (3/4, 1)$. The upper bound is obtained when $(\theta_1, \theta_2) = (3/8, 1/2)$ and $(\pi_1, \pi_2) = (1, 0)$.

Note that the lower bound of PoU(Attitude) and that of PoU(Bayes) are obtained at the same $\theta_i$'s. However, in the worst case, the first is larger than the second. One implication follows. Consider a hypothetical scenario where players are not given distributions for the uncertainties. Suppose a system designer seeks the worst case social welfare. If he assumes the use of the Bayesian game mechanism, he would search for the distributions that yield the least PoU, which is 1/2. If he instead assumes the use of the attitude game mechanism, he would find a least PoU larger than 1/2. The designer would therefore prefer the attitude game mechanism.

One intuitive explanation is that the optimism framework provides an additional degree of freedom in the rational strategy set (consistent set), and thus enables players to play more strategically.


2.5  At  least  one  player  does  not  prefer  pessimism 

We  identify  conditions  when  pessimism  cannot  be  dominant  for  both  players. 

The first theorem proves this for the non-symmetric Cournot duopoly game. The following theorem is for a more general utility structure of symmetric games.

Theorem 12. Both Cournot duopoly players cannot simultaneously have pessimism as their dominant attitude.

Now  we  consider  a  more  general  utility  function  case. 

Theorem 13. Consider a symmetric game where $u_i$ is strictly monotonic in $x_j$ and $r_i(x_j, \theta_i)$ is single valued and strictly monotonic in $x_j$ and $\theta_i$. Then pessimism cannot be a dominant attitude for either of the two players.


2.6  Conclusions 

This chapter proposes a framework to analyze two-player games with non-probabilistic
information  uncertainty.  The  formulation  allows  a  rational  player  to  choose  an  attitude 
against  uncertainty  characterized  by  a  degree  of  optimism.  Corresponding  to  a  pair  of 
attitudes,  we  define  an  uncertainty  equilibrium  as  a  pair  of  sets  of  strategies  from  which 
rational  players  would  not  depart  unilaterally.  This  concept  coincides  with  the  traditional 
Nash  equilibrium  when  there  is  no  uncertainty.  We  then  define  a  two-phase  game  where 
players  first  choose  their  attitude.  Finally,  we  illustrate  the  framework  with  a  consumption 
game  and  a  Cournot  duopoly  game  with  uncertainty.  We  show  that  the  framework  may 
identify  uniquely  the  strategies  of  the  players. 


2.7  Proofs 


This  section  presents  the  proofs  of  the  results  of  this  chapter. 


2.7.1  Proof  of  Theorem  2 

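Since $r_i$ is continuous in $\theta_i$ and $\Theta_i$ is a bounded and closed interval, $X_i$ is a closed interval. Let $X_i = [\underline{x}_i, \bar{x}_i] \subset X_i^0$, $\underline{x}_i \le \bar{x}_i$. We define a map $\phi_i(\underline{x}_i, \bar{x}_i) = (\underline{x}_i', \bar{x}_i')$ such that

$$\underline{x}_i' = \arg\min_{\theta_i \in \Theta_i} r_i([\underline{x}_j, \bar{x}_j], \theta_i, \pi_i)$$
$$\bar{x}_i' = \arg\max_{\theta_i \in \Theta_i} r_i([\underline{x}_j, \bar{x}_j], \theta_i, \pi_i)$$

where

$$\underline{x}_j = \arg\min_{\theta_j \in \Theta_j} r_j([\underline{x}_i, \bar{x}_i], \theta_j, \pi_j)$$
$$\bar{x}_j = \arg\max_{\theta_j \in \Theta_j} r_j([\underline{x}_i, \bar{x}_i], \theta_j, \pi_j).$$

Figure 2.1. $\phi_i$ mapping

By construction $\underline{x}_i' \le \bar{x}_i'$. If $\phi_i$ is a continuous mapping, then by Brouwer’s fixed point theorem, there exists $(\underline{x}_i^*, \bar{x}_i^*)$ with $\underline{x}_i^*, \bar{x}_i^* \in X_i^0$ such that

$$\phi_i(\underline{x}_i^*, \bar{x}_i^*) = (\underline{x}_i^*, \bar{x}_i^*).$$

Then $X_i^* = [\underline{x}_i^*, \bar{x}_i^*]$ is, by definition, an uncertainty equilibrium. Now we show that $\phi_i$ is continuous in $\underline{x}_i, \bar{x}_i$.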

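Let $y := y(\underline{x}_i, \bar{x}_i) := \arg\sup_{x_i \in [\underline{x}_i, \bar{x}_i]} u_j(x_j, x_i, \theta_j)$ and define $z$ such that $\underline{x}_i - \epsilon < z < \underline{x}_i + \epsilon$. Then $\lim_{\epsilon \to 0} u_j(x_j, z, \theta_j) = u_j(x_j, \underline{x}_i, \theta_j)$ from $u_j$'s continuity. There are two cases: (1) $y(\underline{x}_i, \bar{x}_i) > \underline{x}_i$. Then $y(z, \bar{x}_i) = y(\underline{x}_i, \bar{x}_i)$ for small $\epsilon$. (2) $y(\underline{x}_i, \bar{x}_i) = \underline{x}_i$. Then $\underline{x}_i - \epsilon < w := y(z, \bar{x}_i) < \underline{x}_i + \epsilon$. As a result

$$\sup_{x_i \in [z, \bar{x}_i]} u_j(x_j, x_i, \theta_j) - \sup_{x_i \in [\underline{x}_i, \bar{x}_i]} u_j(x_j, x_i, \theta_j) = u_j(x_j, w, \theta_j) - u_j(x_j, \underline{x}_i, \theta_j) \to 0$$

as $\epsilon \to 0$ from $u_j$’s continuity.

Therefore $\sup_{x_i \in [\underline{x}_i, \bar{x}_i]} u_j(x_j, x_i, \theta_j)$ is continuous in $\underline{x}_i$. Similarly we can show it is continuous in $\bar{x}_i$. These steps can be repeated for $\inf_{x_i \in [\underline{x}_i, \bar{x}_i]} u_j(x_j, x_i, \theta_j)$. As a result $f_j$ and $r_j$ are continuous in $\underline{x}_i, \bar{x}_i$. Since $r_j$ is continuous in $\theta_j$ and $\Theta_j$ is a closed and bounded interval, $X_j := [\underline{x}_j, \bar{x}_j] := \{r_j([\underline{x}_i, \bar{x}_i], \theta_j, \pi_j) \mid \theta_j \in \Theta_j\}$ is a closed interval too. Using the same procedure, $\underline{x}_i'$ and $\bar{x}_i'$ are continuous in $\underline{x}_j, \bar{x}_j$. Since $\phi_i$ is a composite function of continuous functions in $\underline{x}_i, \bar{x}_i$, $\phi_i$ is therefore continuous in $(\underline{x}_i, \bar{x}_i)$. This completes the proof.


2.7.2 Proof of Theorem 3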


Let $\Theta_i = \{\theta_i\}$ for all $i$. Then for arbitrary $X_j$, $X_i := \{r_i(X_j, \theta_i, \pi_i) \mid \theta_i \in \Theta_i\}$ is a singleton. Let $X_i = \{x_i'\}$. Then

$$x_j' := r_j(X_i, \theta_j, \pi_j) = \arg\max_{x_j} u_j(x_j, x_i', \theta_j)$$

is $j$'s best response function of game $\mathcal{G}_0$ when $j$ predicts $i$ plays $x_i'$. By assumption an equilibrium of this game is $(x_1^*, x_2^*)$. By construction, it is also an uncertainty equilibrium $(X_1^*(\pi), X_2^*(\pi))$ of $\mathcal{G}(\pi)$, and it does not depend on $\pi$.


2.7.3  Proof  of  Theorem  4 

The partial derivative with respect to $x_i$ is $1 - \exp\{-\theta_i + x_i + x_j\}$, which is positive for $x_i < \theta_i - x_j$ and negative for $x_i > \theta_i - x_j$. Accordingly, the best response $x_i(x_j)$ is $x_i(x_j) = [\theta_i - x_j]^+$. If $\theta_i < \theta_j$, the only Nash equilibrium is then $x_i = 0$, $x_j = \theta_j$. The outcome of the game is very sensitive to the order of the parameters.
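The best response above can be checked numerically. The following is a minimal sketch, assuming the utility $u_i = x_i - \exp\{-\theta_i + x_i + x_j\}$ implied by the stated partial derivative and a strategy space $[0, 1]$; the function names and the sample values of $\theta$ are illustrative only.

```python
import math

def utility(x_i, x_j, theta_i):
    """Utility consistent with the partial derivative 1 - exp(-theta_i + x_i + x_j)."""
    return x_i - math.exp(-theta_i + x_i + x_j)

def best_response(x_j, theta_i):
    """Closed-form best response [theta_i - x_j]^+."""
    return max(theta_i - x_j, 0.0)

def grid_best_response(x_j, theta_i, steps=10_000):
    """Brute-force maximizer of the utility over a grid on [0, 1]."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x_i: utility(x_i, x_j, theta_i))

def is_nash(x_i, x_j, theta_i, theta_j, tol=1e-9):
    """True when each strategy is a best response to the other."""
    return (abs(x_i - best_response(x_j, theta_i)) < tol
            and abs(x_j - best_response(x_i, theta_j)) < tol)

theta_i, theta_j = 0.3, 0.4  # sample values with theta_i < theta_j
print(is_nash(0.0, theta_j, theta_i, theta_j))  # candidate (0, theta_j)
print(is_nash(theta_i, 0.0, theta_i, theta_j))  # mirrored candidate (theta_i, 0)
```

With these sample values, only the profile $(0, \theta_j)$ passes the check, which is the sensitivity to parameter order noted in the proof.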

Assume $i$ knows that $x_j \in X_j$. For $z \in \mathbb{R}$, define $[z]_0^1 := \min\{\max\{z, 0\}, 1\}$. Then, if $i$ is optimistic, she maximizes $x_i - \exp\{-\theta_i + x_i + \alpha_j\}$ where $\alpha_j = \min X_j$. Thus,

$$x_i = [\theta_i - \alpha_j]_0^1 \in [[\alpha - \alpha_j]_0^1, [\beta - \alpha_j]_0^1].$$

Also, if $i$ is pessimistic, she maximizes $x_i - \exp\{-\theta_i + x_i + \beta_j\}$ where $\beta_j = \max X_j$. Thus,

$$x_i = [\theta_i - \beta_j]_0^1 \in [[\alpha - \beta_j]_0^1, [\beta - \beta_j]_0^1].$$

Suppose both players are optimistic. Then the only uncertainty equilibrium is $X_i = X_j = [a, b]$ where $a = \alpha - a$ and $b = \beta - a$. Hence $X_i = X_j = [\frac{\alpha}{2}, \beta - \frac{\alpha}{2}]$. Consequently, $x_i = \theta_i - \frac{\alpha}{2}$ and

$$U_i(1, 1) := \theta_i - \frac{\alpha}{2} - \exp\{\theta_j - \alpha\}.$$

Second, suppose both players are pessimistic. Then the only consistent sets are $X_i = X_j = [a, b]$ where $a = \alpha - b$ and $b = \beta - b$. Hence, $X_i = X_j = [\alpha - \frac{\beta}{2}, \frac{\beta}{2}]$. Consequently, $x_i = \theta_i - \frac{\beta}{2}$ and

$$U_i(0, 0) := \theta_i - \frac{\beta}{2} - \exp\{\theta_j - \beta\}.$$

Third, suppose that player 1 is optimistic and player 2 is pessimistic. In that case, the only consistent sets are $X_1 = [a_1, b_1]$ and $X_2 = [a_2, b_2]$ where $a_1 = [\alpha - a_2]_0^1$, $b_1 = [\beta - a_2]_0^1$, $a_2 = [\alpha - b_1]_0^1$, $b_2 = [\beta - b_1]_0^1$. Hence, $X_1 = [\alpha, \beta]$ and $X_2 = \{0\}$. Consequently, $x_1 = \theta_1$ and $x_2 = \theta_2 - \beta$, so that

$$U_1(1, 0) := \theta_1 - \exp\{\theta_2 - \beta\}.$$

By symmetry,

$$U_1(0, 1) := \theta_1 - \beta - \exp\{\theta_2 - \beta\}.$$

By inspection, we see

$$U_1(1, 0) > U_1(0, 0) \text{ and } U_1(1, 1) > U_1(0, 1).$$

Thus, optimism is a dominant strategy for player 1. By symmetry, it is also dominant for player 2.
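The comparison made "by inspection" can be reproduced for concrete numbers. The sketch below simply evaluates the four closed-form utilities of player 1 stated in this proof; the function names and the sample parameter values are assumptions made for illustration, and the check says nothing about parameter values other than those passed in.

```python
import math

def attitude_utilities(theta_1, theta_2, alpha, beta):
    """Player 1's utilities U_1(pi_1, pi_2); 1 = optimistic, 0 = pessimistic."""
    return {
        (1, 1): theta_1 - alpha / 2 - math.exp(theta_2 - alpha),
        (0, 0): theta_1 - beta / 2 - math.exp(theta_2 - beta),
        (1, 0): theta_1 - math.exp(theta_2 - beta),
        (0, 1): theta_1 - beta - math.exp(theta_2 - beta),
    }

def optimism_dominates(theta_1, theta_2, alpha, beta):
    """True when optimism beats pessimism against either attitude of player 2."""
    u = attitude_utilities(theta_1, theta_2, alpha, beta)
    return u[(1, 0)] > u[(0, 0)] and u[(1, 1)] > u[(0, 1)]

# Sample values with alpha <= theta_i <= beta (chosen for illustration only).
print(optimism_dominates(0.25, 0.25, alpha=0.2, beta=0.3))
```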


2.7.4  Proof  of  Theorem  6 


The proof goes in the following steps: First we define the uncertainty set as a ball. Then we show the ball’s radius is constant. Finally we show the center of the ball is fixed at equilibrium. Note that $u_i$ is negatively affine in $x_j$. Let $X^0 = [0, 1/2]$ be the strategy space. Thus $\inf X_j = \arg\sup_{x_j \in X_j} u_i(x, \theta_i)$ and $\sup X_j = \arg\inf_{x_j \in X_j} u_i(x, \theta_i)$. Define

$$h_i(X_j, \pi_i) = \pi_i \inf X_j + (1 - \pi_i) \sup X_j.$$

Then $f_i(x_i, X_j, \theta_i, \pi_i) = u_i(x_i, h_i(X_j, \pi_i), \theta_i)$. From the first order condition and definition, $i$’s best response to $X_j$ becomes

$$r_i(X_j, \pi_i) = (1 - h_i(X_j, \pi_i) - \theta_i)/2.$$

This yields

$$\sup X_i = (1 - h_i(X_j, \pi_i) - \alpha_i)/2$$
$$\inf X_i = (1 - h_i(X_j, \pi_i) - \beta_i)/2.$$

Now let $X_i^* = B[s_i, t_i]$ for $i = 1, 2$ and $j \ne i$, where $B[s, t]$ is a closed ball of radius $t$ centered at $s$. Then

$$t_i = (\sup X_i^* - \inf X_i^*)/2 = \Delta_i/4,$$

where $\Delta_i := \beta_i - \alpha_i$. This is independent of $X_i^*, X_j^*, \theta_i, \theta_j$. Now since $\sup X_j^* = s_j + t_j$ and $\inf X_j^* = s_j - t_j$,

$$h_i(X_j^*, \pi_i) = s_j + t_j(1 - 2\pi_i).$$

Define $\sigma_i := (\alpha_i + \beta_i)/4$. Then

$$s_i = (\sup X_i^* + \inf X_i^*)/2 = (1 - h_i(X_j^*, \pi_i))/2 - \sigma_i = (1 - s_j - t_j(1 - 2\pi_i))/2 - \sigma_i$$

for $i = 1, 2$. We have two equations relating $s_i$ and $s_j$. Solving them, we get (2.10). $i$’s best response at the uncertainty equilibrium, $x_i = r_i(X_j^*, \theta_i, \pi_i)$, becomes (2.11). $B[s_i, t_i]$ is then uniquely determined by the given $(\pi, \theta_j, \Theta_i, \Theta_j)$. To show its existence, it is sufficient to show $B[s_i, t_i] \subset X^0$. To see this, it is straightforward to verify $\min s_i + t_i \le \sup X^0$ and $\max s_i - t_i \ge \inf X^0$ for all combinations of $\pi, \Theta_1, \Theta_2$.


2.7.5 Proof of Lemma 1


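1. We need to find a condition, without loss of generality, such that (i) $u_1(OO) \ge u_1(PO)$ and (ii) $u_1(OP) \ge u_1(PP)$ for every $\theta_2 \in [\alpha_2, \beta_2]$. By algebra,

$$u_1(OO) - u_1(PO) \ge \Delta_2 [\bar{\theta}_1 - \theta_1]/12,$$

which is non-negative for all $\theta_1 \le \bar{\theta}_1 := \frac{1}{3}(2 - \beta_1 + 4\alpha_2 - 2\beta_2)$ and for all $\theta_2$. (ii) is immediate because

$$u_1(OP) - u_1(PP) \ge u_1(OO) - u_1(PO).$$

2. Similar development yields $\theta_i \ge \underline{\theta}_i$ where $\underline{\theta}_i := \frac{1}{3}(2 - \alpha_i - 2\alpha_j + 4\beta_j)$.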


2.7.6  Proof  of  Theorem  8 

1. We show at least one player always has an incentive to deviate from (PP). This part of the theorem is true even for non-symmetric $\Theta_1$ and $\Theta_2$. Define $u_i(\pi) := u_i(x_i^*(\pi), x_j^*(\pi), \theta_i)$ and $\Delta = \beta - \alpha$. Suppose player 1 does not have the incentive to deviate from (PP). That is, $u_1(PP) \ge u_1(OP)$. We then prove the claim by showing $u_2(OP) > u_2(PP)$. From the proof of Lemma 1, $u_1(PP) \ge u_1(OP)$ is equivalent to $3\theta_1 \ge 2 - 3\alpha - 2\beta + 6\theta_2$. Then, $u_2(PO) - u_2(PP) = \frac{\Delta}{36}(2 - 3\alpha - 2\beta + 6\theta_1 - 3\theta_2) \ge \frac{\Delta}{36}(6 - 9\alpha - 6\beta + 9\theta_2) \ge 0$. The last inequality comes from the boundary condition $0 \le \alpha \le \theta_i \le \beta \le 1/2$.

2. We show that a rival player’s optimistic attitude is always detrimental: $36(u_1(PP) - u_1(PO)) = \Delta(6 x_1(PP)) + \Delta(6 - 6\theta_1 - 6 x_1(PP) - 6 x_2(PP) - \Delta) \ge 0$. We can similarly show $36(u_1(OP) - u_1(OO)) \ge 0$. At (PP), suppose one player has an incentive to change to O. That change hurts the ex post utility of the other player. This concludes that (PP) is Pareto efficient.

3. We need to show $u_i(PP) \ge u_i(OO)$. To see this, $36(u_1(PP) - u_1(OO)) = 12\Delta x_1(PP) - \Delta(6 - 6\theta_1 - 6 x_1(PP) - 6 x_2(PP) - 2\Delta) = \Delta(2 + 2\alpha + 2\beta - 3\theta_1 - 3\theta_2) \ge 0$.

4. If $\beta < \max(1/3, 2\alpha)$, then $\bar{\theta}_i > \beta \ge \theta_i$ for all $i$, and importantly, this fact becomes common knowledge. From Lemma 1, O is the dominant strategy. Together with 1), 2) and 3), this constitutes a Prisoner’s Dilemma game.


2.7.7  Proof  of  Theorem  9 

$u_i(x^*(\pi), \theta_i)$ is non-increasing in $\pi_j$ for all possible combinations of parameters. Thus $u_i$ is minimized at $\pi_j = 1$. $u_i$ is convex in $\pi_i$. From the first order condition, the result is immediately obtained.


2.7.8 Proof of Theorem 12


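Suppose player 1’s dominant attitude is pessimism. From Lemma 1, this implies

$$\beta_1 \ge \theta_1 \ge \underline{\theta}_1 = (2 - \alpha_1 + 4\beta_2 - 2\alpha_2)/3.$$

Now then,

$$\underline{\theta}_2 = (2 - \alpha_2 + 4\beta_1 - 2\alpha_1)/3 \ge (14 - 10\alpha_1 - 11\alpha_2 + 7\beta_2)/9 + \beta_2 > \beta_2.$$

Thus $\theta_2 \le \beta_2 < \underline{\theta}_2$. Therefore pessimism cannot be player 2’s dominant strategy.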


2.7.9  Proof  of  Theorem  13 

Consider player 1 representatively. We will show $U_1(OO) > U_1(PO)$ for some $\theta_2 \in \Theta_2$. Let $u := u_i$, $r := r_i$ and $\Theta := [\alpha, \beta] = \Theta_i$ for $i = 1, 2$, with $\alpha < \beta$. As one case, assume $u_i$ is strictly decreasing in $x_j$, and $r_i$ is decreasing in both $x_j$ and $\theta_i$. The conclusion is the same if any ‘decreasing’ condition is changed to an ‘increasing’ condition. Define equilibrium sets for each $\pi$ as follows:

$X_1 = X_2 = [a, b]$ for $\pi = (OO)$;

$X_1 = X_2 = [c, d]$ for $\pi = (PP)$;

$X_1 = [e, f]$, $X_2 = [g, h]$ for $\pi = (OP)$;

$X_1 = [g, h]$, $X_2 = [e, f]$ for $\pi = (PO)$.

Then

$a = r(a, \beta)$ and $b = r(a, \alpha)$;
$c = r(d, \beta)$ and $d = r(d, \alpha)$;
$e = r(g, \beta)$ and $f = r(g, \alpha)$;
$g = r(f, \beta)$ and $h = r(f, \alpha)$.

From the monotonicity of $r$, we draw the relations one by one: From $a = r(a, \beta)$ and $d = r(d, \alpha)$, it is immediate to see $a < d$. Noting $d = r(r(d, \alpha), \alpha)$ and $g = r(r(g, \alpha), \beta)$, we get $g < d$. Thus $d < f$ from $d = r(d, \alpha)$ and $f = r(g, \alpha)$. From $a < f$, we get $g < a$. Finally we get $a < e$. Take $\theta_2 = \beta$. Then,

$$U_1(OO) = u(x_1(\theta_1, OO), x_2(\theta_2, OO), \theta_1)$$
$$= u(r(a, \theta_1), r(a, \theta_2), \theta_1)$$
$$= u(r(a, \theta_1), r(a, \beta), \theta_1)$$
$$= u(r(a, \theta_1), a, \theta_1)$$
$$> u(r(f, \theta_1), a, \theta_1)$$
$$> u(r(f, \theta_1), e, \theta_1) = U_1(PO).$$

Therefore pessimism cannot be a dominant attitude in a symmetric game.


Chapter 3


A  Cooperative  Game  with 
Non-probabilistic  Uncertainty 


The  motivation  underlying  this  chapter  is  to  analyze  the  effect  of  uncertainty  on  the 
design  and  performance  of  protocols.  The  chapter  considers  two  types  of  situation.  The 
first  is  when  different  nodes  in  the  network  have  bounded  knowledge  about  what  other 
nodes  know.  The  second,  called  common  knowledge  about  inconsistent  beliefs,  is  when 
the  information  is  inconsistent  but  everyone  knows  it.  Situations  of  bounded  or  inconsistent 
information  arise  naturally  in  networks  because  the  state  of  these  systems  changes  and  nodes 
take  time  to  learn  of  those  changes. 

The specific problem that this chapter explores is the relaying of packets in a simple butterfly network. Despite its apparent simplicity, this problem illustrates key features of situations of uncertain knowledge that arise in networks. This chapter presents two impossibility facts and one possibility fact. In the latter, we introduce a scheme that enables optimal coordination given persisting imperfection in knowledge.


3.1  Introduction 

This chapter studies the impact of bounded or inconsistent information on the performance of a simple relay network.

In a network, nodes typically implement distributed protocols for routing, relaying, discovery, leader election, congestion control, and other operations. Generally, the nodes have delayed and incomplete information about the state of the network. It is therefore natural to question the impact of this incomplete information on the performance of the protocols.

A first line of inquiry considers delays and lack of synchrony among the nodes. A representative result is that a distributed Bellman-Ford protocol converges to the shortest paths if messages are eventually delivered between nodes, assuming that the network topology does not change [10]. More general results concern the convergence of parallel and distributed algorithms [9].

A second thread of investigation addresses impossibility theorems for distributed applications. An early result is the impossibility of two generals agreeing with certainty when the messages they exchange have some probability of not being delivered [20, 29]. Another well-known result is the Byzantine generals problem, where loyal generals cannot agree on whether to attack or retreat if at least one third of the generals are traitors [24, 31].

In  game  theory,  a  related  formulation  of  the  imperfection  of  information  has  received 
considerable  attention  after  the  publication  of  Rubinstein’s  electronic  mail  paper  [36].  In 
that  paper,  two  friends  exchange  lossy  messages  to  decide  whether  to  go  out  for  coffee.  One 
friend  knows  that  the  weather  is  bad  and  tries  to  agree  with  his  friend  that  they  should 
postpone  their  going  out.  Even  after  a  large  number  of  messages,  they  may  end  up  not 
making  the  correct  joint  decisions. 

This  chapter  examines  similar  situations  where  different  nodes  should  coordinate  their 
actions  to  prevent  a  bad  outcome.  However,  because  of  imperfection  of  knowledge,  the  nodes 
may  choose  the  wrong  actions.  We  focus  on  a  simple  example  where  only  one  of  two  nodes  in 
a  network  should  relay  a  packet  to  prevent  a  collision.  The  difficulty  is  that  the  nodes  do  not 
know  perfectly  the  two  probabilities  of  success  nor  what  the  other  node  knows.  Even  after 
exchanging  an  arbitrarily  large  number  of  ‘link  state  messages’  the  nodes  may  end  up  making 
the  same  decision  of  either  relaying  the  packet  or  not.  The  goal  is  to  explore  protocols  that 
avoid  such  pitfalls  and  are  robust  with  respect  to  imperfect  knowledge. 

The first part focuses on the impact of bounded knowledge. The second part studies situations where the nodes have inconsistent beliefs but this inconsistency is common knowledge.

Many  other  protocol  design  problems  face  similar  difficulties,  such  as  leader  election, 
routing,  and  forwarding.  We  hope  that  the  discussion  here  will  increase  awareness  of  this 
issue. 


3.2  Problem  Formulation 

Consider the network shown in Figure 3.1. There are four wireless nodes: S, A, B, and D. At time 0, node S broadcasts a packet to increase the chance of delivery to D, and relay nodes A and B receive it correctly. At time 1, the nodes A and B decide to forward the packet with probability $a$ and $b$, respectively. If node A forwards the packet, it gets to node D with probability $p_A$. Otherwise, the link from A to D is in deep fade and no energy reaches node D. The situation is similar for node B, but with probability $p_B$ instead of $p_A$. The assumption is that if the packet reaches D both from A and from B, then the two copies of the packet collide and D does not get the packet correctly. The question of interest is how A and B should choose the probabilities $a$ and $b$ to maximize the probability $\pi(a, b)$ that D gets the packet. (One can think of a more general scenario where A and B receive the packet from S with some probability or where simultaneous forwarding may not yield packet loss. It does not change the conclusions of the chapter.)

Figure 3.1. Butterfly relay network

From the description of the system, one finds that $\pi(a, b)$ is given by

$$\pi(a, b) = p_A a + p_B b - 2 p_A p_B ab. \qquad (3.1)$$
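Equation (3.1) is the probability that exactly one copy of the packet reaches D. The sketch below checks this by enumerating the four arrival outcomes, assuming that the forwarding decisions and the link fades of A and B are independent (the formula implies this, although the text does not state it explicitly). The function names are illustrative.

```python
from itertools import product

def success_probability(a, b, p_a, p_b):
    """Equation (3.1)."""
    return p_a * a + p_b * b - 2 * p_a * p_b * a * b

def success_by_enumeration(a, b, p_a, p_b):
    """Sum the probabilities of the outcomes in which exactly one copy arrives."""
    arrive_a = a * p_a  # A forwards and its link is not in deep fade
    arrive_b = b * p_b  # same for B
    total = 0.0
    for got_a, got_b in product([True, False], repeat=2):
        prob_a = arrive_a if got_a else 1 - arrive_a
        prob_b = arrive_b if got_b else 1 - arrive_b
        if got_a != got_b:  # exactly one copy arrives, hence no collision
            total += prob_a * prob_b
    return total

print(success_probability(0.7, 0.4, 0.9, 0.6))
print(success_by_enumeration(0.7, 0.4, 0.9, 0.6))
```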


If the nodes A and B both know $p := (p_A, p_B)$ and share that knowledge as a common fact, they can choose the values $a^*$ and $b^*$ such that

$$\pi(a^*, b^*) = \pi^* := \max_{a,b} \pi(a, b).$$


We  call  this  situation  perfect  knowledge.  Thus,  both  nodes  know  p  and  know  that  both  know 
it.  The  knowledge  is  common  and  exact:  the  nodes  know  the  precise  state  of  the  network 
and  they  both  know  that  precise  knowledge  is  common  to  both. 


It is easy to verify that

$$(a^*, b^*) = \begin{cases} (1, 1), & \text{if } p \in [0, \frac{1}{2}]^2 \\ (1\{p_A > p_B\}, 1\{p_A < p_B\}), & \text{otherwise} \end{cases} \qquad (3.2)$$

with the corresponding optimal performance

$$\pi^* = \begin{cases} p_A + p_B - 2 p_A p_B, & \text{if } p \in [0, \frac{1}{2}]^2 \\ \max\{p_A, p_B\}, & \text{otherwise.} \end{cases} \qquad (3.3)$$

Roughly speaking, if neither of the links AD and BD is good ($p_A$ and $p_B$ are small), both nodes should relay. Otherwise, only the node with the better link should relay.
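The rule in (3.2) can be compared against a direct search. Since $\pi(a, b)$ is bilinear in $(a, b)$, its maximum over $[0, 1]^2$ is attained at a corner, so checking the four corners suffices. This is a minimal sketch with illustrative function names; the sample points avoid the tie $p_A = p_B$.

```python
def success_probability(a, b, p_a, p_b):
    """Equation (3.1)."""
    return p_a * a + p_b * b - 2 * p_a * p_b * a * b

def optimal_relaying(p_a, p_b):
    """Relaying probabilities (a*, b*) from (3.2)."""
    if p_a <= 0.5 and p_b <= 0.5:
        return 1.0, 1.0
    return float(p_a > p_b), float(p_a < p_b)

def best_corner_value(p_a, p_b):
    """Maximum of pi over the four corners of [0, 1]^2."""
    return max(success_probability(a, b, p_a, p_b)
               for a in (0, 1) for b in (0, 1))

for p_a, p_b in [(0.3, 0.4), (0.8, 0.3), (0.6, 0.9)]:
    a_star, b_star = optimal_relaying(p_a, p_b)
    print((p_a, p_b), (a_star, b_star),
          success_probability(a_star, b_star, p_a, p_b),
          best_corner_value(p_a, p_b))
```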

However, the success probabilities $p$ of the links change over time and the nodes can observe only their local link directly. Thus, in practice, the nodes never have perfect knowledge. One practical approach is for the nodes to exchange ‘link state’ messages to improve their knowledge. A key aspect of the formulation is to model precisely the knowledge of the nodes A and B and to understand how this knowledge affects their decisions and the resulting performance measures of the network.

The nodes A and B communicate somehow to increase their knowledge about the network. Their communication path is not explicitly shown in the figure. The nodes exchange lossy messages and we examine what they know after $n$ messages. We call that knowledge ‘Level-n knowledge.’

Initially, before they exchange messages, we assume that A knows $p_A$ and B knows $p_B$. This is Level-0 knowledge. Now synchronously each node sends a message to the other. Node A sends a message to B saying ‘I know $p_A$.’ At the same time, B sends a message to A saying ‘I know $p_B$.’ (The synchronous assumption is relaxed later.) When it gets that message, B knows $p_B$, and that A knows $p_A$. However, B is not sure that A knows that B knows $p_A$. Similarly, A knows $p_A$, and that B knows $p_B$, but A is not sure that B knows that A knows $p_B$. This is Level-1 knowledge. After the next exchange of messages, the nodes have Level-2 knowledge, and so on. Note that Level-n knowledge is defined when the nodes receive the $n$ messages, even though the nodes assume that these messages can get lost.

These levels of knowledge can be formalized as follows. Let the notation $K_A(0)$ mean ‘A knows $p_A$.’ Similarly, $K_B(0)$ means ‘B knows $p_B$.’ Let then $K_A(1)$ mean ‘A knows $p_A$ and $K_B(0)$.’ That is, $K_A(1)$ means ‘A knows $p_A$ and that B knows $p_B$.’ Inductively, define $K_A(n+1)$ to mean ‘A knows $p_A$ and $K_B(n)$’ and similarly $K_B(n+1)$ to mean ‘B knows $p_B$ and $K_A(n)$’ for $n \ge 0$. The interpretation is that the $(n+1)$th message from A to B carries $K_A(n)$, so that upon receiving it node B knows $p_B$ and $K_A(n)$, which is $K_B(n+1)$. The situation is similar with A and B interchanged.
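The inductive definition of $K_A(n)$ and $K_B(n)$ can be unrolled mechanically. The following sketch only illustrates the nesting; the function name and the bracketed string format are assumptions made for readability.

```python
def knowledge(node, level):
    """Spell out K_node(level) following the inductive definition."""
    other = "B" if node == "A" else "A"
    own = f"{node} knows p_{node}"
    if level == 0:
        return own
    # K_node(n + 1) = 'node knows p_node and K_other(n)'
    return f"{own} and that [{knowledge(other, level - 1)}]"

print(knowledge("A", 2))
# A knows p_A and that [B knows p_B and that [A knows p_A]]
```

Each additional level wraps the other node's previous statement once more, which is why no finite level reaches the unbounded nesting required for common knowledge.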

Of course, the $n$th message from A may get lost, in which case the knowledge of B remains what it was previously. The discussion of this case is postponed to Section 3.4.

One expects that, as they exchange more and more messages, the nodes’ knowledge approaches perfect knowledge. However, it turns out that the values of the relaying probabilities $a_n$ and $b_n$ that the nodes choose with Level-n knowledge may result in a probability of success $\pi(a_n, b_n)$ that does not approach $\pi^*$.


3.3  Analysis 

To study the impact of imperfect knowledge on node decisions, we explore the strategies of the two nodes A and B under different levels of knowledge.

Figure 3.2. (a) Level-1 (b) Level-(n + 1) knowledge structure. Upper part for A, lower part for B. Note that the two dotted boxes contain disparate knowledge for B.

3.3.1  Level-0 

Consider first the case of Level-0 knowledge where node A knows $p_A$ and node B knows $p_B$ but not more than that. Since node A does not know anything about $p_B$ and what B knows, it is sensible for that node to choose a value of its relaying probability $a$ that guarantees a good probability of success, no matter what $p_B$ is and what the choice of node B is. That is, node A chooses the reliable value $a_0$ of $a$ given by

$$a_0 = \arg\max_a \min_{b, p_B} \pi(a, b).$$

Similarly, node B chooses the value $b_0$ of $b$ given by

$$b_0 = \arg\max_b \min_{a, p_A} \pi(a, b).$$

From (3.1), one finds that

$$\pi(a, b) = p_A a + p_B b (1 - 2 p_A a),$$

so that

$$\min_{b, p_B} \pi(a, b) = \begin{cases} 1 - p_A a, & \text{if } 1 < 2 p_A a \\ p_A a, & \text{otherwise.} \end{cases}$$

Consequently, the maximizing value $a_0$ of $\min_{b, p_B} \pi(a, b)$ is given by

$$a_0 = \min\{\frac{1}{2 p_A}, 1\}.$$

Similarly,

$$b_0 = \min\{\frac{1}{2 p_B}, 1\}.$$
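The max-min choice can be cross-checked by brute force. In the sketch below the inner minimum uses the fact that $\pi$ depends on $b$ and $p_B$ only through the product $q = p_B b \in [0, 1]$ and is linear in $q$, so the minimum sits at $q = 0$ or $q = 1$. The grid resolution and function names are illustrative choices, and $p_A > 0$ is assumed.

```python
def worst_case(a, p_a):
    """min over b and p_B of pi(a, b) = p_A a + q (1 - 2 p_A a), q in [0, 1]."""
    return min(p_a * a, 1 - p_a * a)  # values at q = 0 and q = 1

def a0_closed_form(p_a):
    """Closed form a_0 = min{1 / (2 p_A), 1}, assuming p_A > 0."""
    return min(1 / (2 * p_a), 1.0)

def a0_grid(p_a, steps=10_000):
    """Brute-force max-min over a grid of relaying probabilities."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda a: worst_case(a, p_a))

for p_a in (0.4, 0.5, 0.8):
    print(p_a, a0_closed_form(p_a), a0_grid(p_a))
```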


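The resulting probability of success is

$$\pi_0 := \pi(a_0, b_0) = p_A a_0 + p_B b_0 - 2 p_A p_B a_0 b_0 = \min\{\frac{1}{2}, p_A\} + \min\{\frac{1}{2}, p_B\} - 2 \min\{\frac{1}{2}, p_A\} \min\{\frac{1}{2}, p_B\}. \qquad (3.4)$$

We find that

$$\frac{\pi_0}{\pi^*} = \begin{cases} 1, & \text{if } p \in [0, \frac{1}{2}]^2 \\ \frac{1}{2 \max(p_A, p_B)}, & \text{otherwise.} \end{cases} \qquad (3.5)$$

Note that $\frac{1}{2}\pi^* \le \pi_0 \le \pi^*$. Therefore the network is guaranteed not to lose more than half of the performance when relays have Level-0 knowledge.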
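The bound on the Level-0 loss can be checked on a grid of link probabilities. The sketch below evaluates $\pi_0$ and $\pi^*$ from the expressions above; the function names and the grid are illustrative, and $p = (0, 0)$ is excluded to avoid dividing by zero.

```python
def pi0(p_a, p_b):
    """Level-0 success probability pi(a_0, b_0)."""
    m_a, m_b = min(0.5, p_a), min(0.5, p_b)
    return m_a + m_b - 2 * m_a * m_b

def pi_star(p_a, p_b):
    """Optimal success probability under perfect knowledge."""
    if p_a <= 0.5 and p_b <= 0.5:
        return p_a + p_b - 2 * p_a * p_b
    return max(p_a, p_b)

def ratio(p_a, p_b):
    return pi0(p_a, p_b) / pi_star(p_a, p_b)

grid = [k / 20 for k in range(1, 21)]
worst = min(ratio(p_a, p_b) for p_a in grid for p_b in grid)
print(worst)  # smallest ratio found on the grid
```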


3.3.2  Level-1 


After exchanging the first messages, the nodes reach Level-1 knowledge $K_A(1)$ and $K_B(1)$. That is, A has learned that B knows $K_B(0)$ and, consequently, that B will base the choice of $b$ on $K_B(0)$. That is, A considers that B will choose $b = b_0$. Accordingly, A chooses the value $a = a_1$ such that

$$a_1 = \arg\max_a \pi(a, b_0).$$

Since $b_0 = \min\{\frac{1}{2 p_B}, 1\}$, one finds

$$\pi(a, b_0) = p_A a + \min\{\frac{1}{2}, p_B\} - p_A a \min\{1, 2 p_B\} = p_A a (1 - \min\{1, 2 p_B\}) + \min\{\frac{1}{2}, p_B\}.$$

Consequently the maximizing value $a_1$ is given by

$$a_1 = \begin{cases} 1, & \text{if } p_B < \frac{1}{2} \\ \text{any in } [0, 1], & \text{otherwise.} \end{cases}$$

Similarly,

$$b_1 = \begin{cases} 1, & \text{if } p_A < \frac{1}{2} \\ \text{any in } [0, 1], & \text{otherwise.} \end{cases} \qquad (3.6)$$

For a certain set of $p$, multiple choices are possible for $a_1$ and/or $b_1$, which in turn correspond to different values of the probability of success $\pi(a_1, b_1)$. The worst case is a possible cost of the lack of knowledge. Define $\pi_1 := \min \pi(a_1, b_1)$. One finds

$$\pi_1 = \begin{cases} 0, & \text{for } p \in (\frac{1}{2}, 1]^2 \\ p_A + p_B - 2 p_A p_B, & \text{otherwise.} \end{cases}$$

Consequently,

$$\frac{\pi_1}{\pi^*} = \begin{cases} 1, & \text{for } p \in [0, \frac{1}{2}]^2 \\ 0, & \text{for } p \in (\frac{1}{2}, 1]^2 \\ \frac{p_A + p_B - 2 p_A p_B}{\max(p_A, p_B)}, & \text{otherwise,} \end{cases} \qquad (3.7)$$

which shows that the imperfect knowledge can reduce the probability of success to zero.
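The worst-case Level-1 performance $\pi_1$ can be computed directly from (3.6). Because $\pi$ is bilinear, the minimum over the admissible $(a_1, b_1)$ is attained at the endpoints of the admissible intervals, so the sketch below only evaluates those. The function names are illustrative.

```python
def pi(a, b, p_a, p_b):
    """Equation (3.1)."""
    return p_a * a + p_b * b - 2 * p_a * p_b * a * b

def level1_choices(p_other):
    """Endpoints of the admissible Level-1 relaying probabilities, per (3.6)."""
    return [1.0] if p_other < 0.5 else [0.0, 1.0]

def pi1(p_a, p_b):
    """Worst-case success probability over the admissible (a_1, b_1)."""
    return min(pi(a, b, p_a, p_b)
               for a in level1_choices(p_b)   # a_1 depends on p_B
               for b in level1_choices(p_a))  # b_1 depends on p_A

print(pi1(0.8, 0.9))  # both links good: the worst case is zero
print(pi1(0.8, 0.3))  # mixed case
print(pi1(0.3, 0.4))  # both links weak
```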


3.3.3  Level-n  and  Failure  of  Convergence 

After exchanging the $(n+1)$th messages and reaching Level-$(n+1)$ knowledge, node A chooses $a_{n+1}$ as the best response to its belief about node B’s decision, and similarly for B. That is,

$$a_{n+1} = \arg\max_a \min_{b_n} \pi(a, b_n)$$
$$b_{n+1} = \arg\max_b \min_{a_n} \pi(a_n, b).$$

It is easy to verify that $a_{2k} = a_0$, $b_{2k} = b_0$ and $a_{2k+1} = a_1$, $b_{2k+1} = b_1$ for all $k \in \mathbb{Z}_+$. Since $a_0 \ne a_1$ and $b_0 \ne b_1$, one sees that the solution does not converge as the level of knowledge increases. In general, for $(p_A, p_B) \notin [0, \frac{1}{2}]^2$,

$$\limsup_{n \to \infty} \pi_n \le \pi_0 < \pi^*. \qquad (3.8)$$

Thus, robust optimization against uncertainties never leads to the optimal performance regardless of the number of messages exchanged.

The failure to converge is due to an excess of caution. At step $n$, the relays try to maximize the worst-case success probability over the possible choices of the other relay, based on what the other relay might know at that time. Even if the nodes exchange $n$ messages successfully, the possibility that a message gets lost suffices to prevent the nodes from making optimal decisions. Figure 3.3 summarizes the results of this section.

More generally, the nodes may have only imprecise Level-0 knowledge. For example, node A knows that $p$ belongs to a set $Z$, or $K_A(0) = \{p \in Z\}$, $Z \subset [0, 1]^2$. Imprecise knowledge is widespread because of imperfect observation or estimation of the state of nature. As a result, the nodes know only a rough range containing the true state.

Figure 3.3. Evolution of the network performance (a) $p \in [0, \frac{1}{2}]^2$ (b) $p \in (\frac{1}{2}, 1]^2$


Lemma 2. Define $Z := [p'_A, p''_A] \times [p'_B, p''_B]$. The reliable solution for $a$ at Level 0 is

$$a_0(Z) = \begin{cases} 1, & \text{for } p''_B \le \frac{1}{2} \\ 1, & \text{for } p''_B (2 p''_A - 1) \le (p''_A - p'_A) \\ \frac{p''_B}{p'_A + p''_A (2 p''_B - 1)}, & \text{otherwise.} \end{cases}$$

See Appendix for proof. An interesting special case is when A has no clue about the exact link states. That is, $Z = [0, 1]^2$. The result suggests $a_0(Z) = 1$, or to forward always. This general solution does not change the result of failure of convergence.
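Lemma 2 can be sanity-checked against a brute-force max-min. The sketch below assumes, as in the Level-0 analysis, that node A guards against every $b \in [0, 1]$ and every $p \in Z$; under that assumption only the product $q = p_B b \in [0, p''_B]$ matters, so $p'_B$ does not enter. The function names, grid resolution and sample sets $Z$ are illustrative.

```python
def a0_lemma2(pa_lo, pa_hi, pb_hi):
    """Closed form of Lemma 2 for Z = [pa_lo, pa_hi] x [pb_lo, pb_hi]."""
    if pb_hi <= 0.5:
        return 1.0
    if pb_hi * (2 * pa_hi - 1) <= pa_hi - pa_lo:
        return 1.0
    return pb_hi / (pa_lo + pa_hi * (2 * pb_hi - 1))

def worst_case(a, pa_lo, pa_hi, pb_hi):
    """min of pi = p_A a + q (1 - 2 p_A a) over p_A and q = p_B b.

    The expression is bilinear in (p_A, q), so the minimum over the box
    [pa_lo, pa_hi] x [0, pb_hi] is attained at one of its corners.
    """
    return min(p_a * a + q * (1 - 2 * p_a * a)
               for p_a in (pa_lo, pa_hi) for q in (0.0, pb_hi))

def a0_grid(pa_lo, pa_hi, pb_hi, steps=10_000):
    """Brute-force max-min over a grid of relaying probabilities."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda a: worst_case(a, pa_lo, pa_hi, pb_hi))

print(a0_lemma2(0.6, 0.9, 0.8), a0_grid(0.6, 0.9, 0.8))
print(a0_lemma2(0.8, 0.8, 1.0))  # exact p_A, unknown p_B: recovers 1 / (2 p_A)
```

With $p'_A = p''_A = p_A$ and $p''_B = 1$ the closed form reduces to $\min\{1/(2 p_A), 1\}$, in agreement with the Level-0 solution $a_0$.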


3.4  Knowledge  with  Message  Loss 

In the previous section, we implicitly assumed the nodes A and B synchronously increase their knowledge level despite the possibility that messages might get lost. Figure 3.4 illustrates a more realistic scenario: At some time $t_0$, nodes A and B have knowledge $K_A(n)$ and $K_B(n)$ respectively. At time $t_0$ both nodes send messages to each other. Node A’s message gets lost and node B does not receive it, whereas node B’s message reaches node A. At time $t_1$, node A reaches an additional level of knowledge while node B’s knowledge does not change. At the next time step, node B’s knowledge level jumps by 2, to level $n + 2$, whereas node A’s knowledge does not change and remains at level $n + 1$, and so on. Note that the nodes’ knowledge level may lose synchronization when a message is lost. However, their knowledge level gap is never more than one because one node’s next knowledge level depends on the other node’s current knowledge level.

Figure 3.4. Assumption of synchronous message exchange is relaxed: Knowledge evolution when a message from A to B is lost at time $t_0$.

Thus, at any given time $t$, the network performance can be either

$$\pi(a_{n+1}, b_n) \text{ or } \pi(a_n, b_n) \text{ or } \pi(a_n, b_{n+1}).$$

Consequently, in addition to the lossless case, it suffices to consider $\pi(a_1, b_0)$ and $\pi(a_0, b_1)$. From (3.3.1) and (3.6), one finds

$$\pi(a_1, b_0) = \begin{cases} p_A + p_B - 2 p_A p_B, & \text{if } p_B < \frac{1}{2} \\ \frac{1}{2}, & \text{otherwise.} \end{cases}$$

By symmetry, $\pi(a_0, b_1)$ is similarly found.
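The mixed-level value $\pi(a_1, b_0)$ can be evaluated numerically. In the sketch below, when $p_B \ge 1/2$ any $a_1 \in [0, 1]$ is admissible, so the caller supplies it through a hypothetical parameter `a1_if_free`; the result should not depend on that choice. Function names are illustrative and $p_B > 0$ is assumed.

```python
def pi(a, b, p_a, p_b):
    """Equation (3.1)."""
    return p_a * a + p_b * b - 2 * p_a * p_b * a * b

def b0(p_b):
    """Level-0 choice of node B, assuming p_B > 0."""
    return min(1 / (2 * p_b), 1.0)

def mixed_level(p_a, p_b, a1_if_free=0.0):
    """pi(a_1, b_0): A acts on Level-1 knowledge, B on Level-0 knowledge."""
    a1 = 1.0 if p_b < 0.5 else a1_if_free
    return pi(a1, b0(p_b), p_a, p_b)

print(mixed_level(0.8, 0.3))  # p_B < 1/2
print([mixed_level(0.8, 0.9, a) for a in (0.0, 0.3, 1.0)])  # p_B >= 1/2
```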

This  completes  the  claim  that  the  relays  cannot  reach  optimal  coordination  via  lossy 
message  exchange  no  matter  how  high  a  knowledge  level  they  obtain. 


We  conclude  this  section  with  a  summarizing  fact. 


Fact  1.  A  distributed  system  cannot  reach  the  optimal  coordination  by  building  higher  level 
of  bounded  knowledge  via  lossy  message  exchange. 


3.5  Achieving  Optimal  Coordination 

The previous section shows that the lack of certainty in message delivery prevents the nodes from coordinating their actions optimally. This failure of optimal coordination persists regardless of the level of knowledge. This observation is similar to the conclusions of the electronic mail game [36], which led many researchers to study the continuity of belief structure. Hopefully, the model of this paper shows the relevance of such considerations to network protocols.

The practical question is how to find a mechanism that achieves the optimal coordination
based on local knowledge $K_A(n)$ or $K_B(n)$ among distributed agents. The preceding analysis
shows that the set of parameters of the system determines the evolution of the performance as
$n$ increases. For instance, the analysis shows that if $p := (p_A, p_B) \in [0, \tfrac{1}{2}]^2$, then $a_n = b_n = 1$
is optimal for $n \ge 0$. Also, once node A knows $K_A(n)$ for some $n \ge 1$, it knows that $\pi = \pi^*$
regardless of the level of knowledge that node B has reached. The same is true for node
B. That is, if $p \in [0, \tfrac{1}{2}]^2$, the nodes know that Level-1 knowledge suffices to enable optimal
decisions.

However, the solution does not converge for $p \notin [0, \tfrac{1}{2}]^2$. In that case, the nodes know that
basing their decisions on Level-$n$ knowledge does not lead to optimal coordination. With
this observation, they can choose to deviate from the myopic max-min strategy and follow
instead the following mechanism:


OPTIMAL  COORDINATION-ACHIEVING  SCHEME 

Upon receiving message $K_B(0)$ from B, A updates its knowledge and
obtains $K_A(1)$. However, node A does not send $K_A(1)$ to node B.
Instead, node A keeps sending $K_A(0)$ to node B. Similarly, node B sends
$K_B(0)$ to node A, even though node B actually knows $K_B(1)$.

Once node A obtains $K_A(1)$, it learns whether $p \in [0, \tfrac{1}{2}]^2$. In that case, the relaying solution
based on $K_A(1)$ is known to be optimal. If $p \notin [0, \tfrac{1}{2}]^2$, node A knows that gaining a higher
level of knowledge results in oscillations that can in turn yield a zero probability of success.
Accordingly, node A does not follow the myopic max-min algorithm based on an additional
level of knowledge. Instead, it assumes that $K_A(1)$ is the global information. At that time,
although node B still has knowledge $K_B(0)$, the network performance is guaranteed to be at
least half the optimal level, because of the Level-0 result. If node B has knowledge $K_B(1)$,
then both nodes A and B agree to the optimal coordination. Since message delivery is lossy, node
A keeps sending $K_A(0)$, so that node B eventually reaches knowledge $K_B(1)$ with probability
one.
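The behavior of the two relays under this scheme can be sketched as a small simulation. This is a minimal illustration, not the protocol specification: the `Relay` class, the independent per-message loss model with probability `loss_prob`, and the round structure are all assumptions introduced here.

```python
import random

class Relay:
    """Relay that caps its knowledge at Level 1 and always advertises Level 0."""

    def __init__(self, name):
        self.name = name
        self.level = 0  # highest knowledge level held locally

    def outgoing(self):
        # Restricted propagation: always send K(0), even after reaching K(1).
        return 0

    def receive(self, peer_level):
        # Receiving K(0) from the peer yields K(1); nothing higher is ever sent.
        self.level = max(self.level, peer_level + 1)

def run_rounds(loss_prob, max_rounds, rng):
    """Exchange messages until both relays hold Level-1 knowledge."""
    a, b = Relay("A"), Relay("B")
    for round_no in range(1, max_rounds + 1):
        msg_a, msg_b = a.outgoing(), b.outgoing()
        if rng.random() >= loss_prob:  # A -> B delivered
            b.receive(msg_a)
        if rng.random() >= loss_prob:  # B -> A delivered
            a.receive(msg_b)
        if a.level == 1 and b.level == 1:
            return round_no, a.level, b.level
    return None, a.level, b.level

rounds, level_a, level_b = run_rounds(loss_prob=0.2, max_rounds=1000,
                                      rng=random.Random(0))
```

Because `outgoing` never advertises more than Level 0, neither relay can climb past Level 1, which is the point of the scheme: the oscillation caused by higher levels is ruled out by construction.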

This is a strategy combining pessimism and optimism with a restriction on knowledge
propagation. At level 0 knowledge, one plays a robust strategy (pessimism). At level 1,
each relay keeps sending level 0 knowledge to the other, although it has built level 1 knowledge,
and pretends to have common knowledge. That is, the relays throw caution to the wind by
restricting knowledge propagation (KR) and hope for the best.

Let $C_n$ be the event that the two relays reach consensus about the complete network
parameters within the $n$th round of message exchange. The probability $P(C_n)$ of the event $C_n$
is given by

$$P(C_n) = 1 - \left(1 - (1 - \epsilon)^2\right)^n \approx 1 - (2\epsilon)^n,$$

where $\epsilon$ is the small probability that a message may get lost between the two relays. This
probability approaches 1 at an exponential rate. Also, using this modified protocol, the
two relays choose the probabilities $a^*$ and $b^*$ when the event $C_n$ occurs. Let $\pi_{KR}(n)$ be the
performance of the protocol after the relays have sent $n$ messages. We find that

$$\pi^* P(C_n) \le \pi_{KR}(n) \le \pi^*$$

and $\pi^* P(C_n) \to \pi^*$ as $n \to \infty$. Therefore

$$\lim_{n \to \infty} \pi_{KR}(n) = \pi^*. \qquad (3.9)$$
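To get a feel for how quickly $P(C_n)$ approaches 1, the following sketch evaluates the expression for $P(C_n)$ given above, its small-$\epsilon$ approximation, and the resulting bounds on $\pi_{KR}(n)$. The function names and sample values are illustrative assumptions.

```python
def consensus_prob(eps, n):
    """P(C_n) = 1 - (1 - (1 - eps)^2)^n, as stated in the text."""
    per_round_success = (1.0 - eps) ** 2
    return 1.0 - (1.0 - per_round_success) ** n

def consensus_prob_approx(eps, n):
    """Small-eps approximation 1 - (2 eps)^n."""
    return 1.0 - (2.0 * eps) ** n

def kr_performance_bounds(pi_star, eps, n):
    """Lower and upper bounds: pi* P(C_n) <= pi_KR(n) <= pi*."""
    return pi_star * consensus_prob(eps, n), pi_star

# Example: with a 5% loss probability the lower bound closes in on pi* quickly.
bounds_by_round = [kr_performance_bounds(0.8, 0.05, n) for n in range(1, 6)]
```

Since $1 - (1 - \epsilon)^2 = 2\epsilon - \epsilon^2 \le 2\epsilon$, the approximation slightly understates $P(C_n)$, so it is a conservative estimate.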

It is worth mentioning some differences with the Electronic Mail Game result, where no
finite sequence of message exchanges can result in optimal coordination. First, in the current
problem, the payoff is defined as the max-min performance rather than the von Neumann-Morgenstern
form. Second, there is no negative payoff biasing the players' decision. Third,
the different message exchange protocol is pivotal because it does not assume an automatic
acknowledgment, which is one of the main causes making the coordination impossible, as pointed
out in [11].


Fact 2. A distributed system with lossy message exchange can asymptotically reach the
optimal coordination by restricting the information propagation.


3.6  Throughput 

We analyze the throughput performance of 2-stage protocols and 3-stage protocols such
as repetition coding and time division multiplexing (TDM), which we describe next. The
throughput is defined as the average rate of successful deliveries per unit time.

Time cost plays a critical role. If learning takes negligible time or no further learning is
needed after initial learning, then the throughput of the scheme with learning may outperform
that of the oblivious one. However, if the scheme requires continuous learning at non-negligible
time cost, then we will see that some oblivious scheme in fact outperforms the one with learning.

The first 3-stage protocol is repetition coding. In this protocol, the relays transmit twice
with the probabilities $a_0$ and $b_0$, respectively. The corresponding probability of success is

$$\pi_{REP} = 1 - (1 - \pi_0)^2 = 2\pi_0 - \pi_0^2.$$

Since $\pi_0 \le \tfrac{1}{2}$, $\pi_{REP} \le \tfrac{3}{4}$. Considering the fact that KR may reach $\pi_{KR} = 1$ depending on
the value of $p$, repetition coding may not be close to optimal. Note that the throughput
of repetition coding is always better than or equal to the throughput of the 2-stage oblivious
protocol. That is, define $T_0 = \pi_0/2$ and $T_{REP} = \pi_{REP}/3$. Since $6(T_{REP} - T_0) = 4\pi_0 - 2\pi_0^2 - 3\pi_0 = \pi_0(1 - 2\pi_0) \ge 0$,

$$T_0 \le T_{REP}. \qquad (3.10)$$
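The comparison in (3.10) is easy to confirm numerically. The sketch below sweeps $\pi_0$ over $[0, 1/2]$ and evaluates both throughputs. The function names are illustrative, and the division by 2 and 3 reflects the stage counts used in the definitions of $T_0$ and $T_{REP}$ above.

```python
def pi_rep(pi0):
    """Success probability of repetition coding: 1 - (1 - pi0)^2."""
    return 1.0 - (1.0 - pi0) ** 2

def throughput_oblivious(pi0):
    """T_0 = pi0 / 2 for the 2-stage oblivious protocol."""
    return pi0 / 2.0

def throughput_rep(pi0):
    """T_REP = pi_REP / 3 for the 3-stage repetition protocol."""
    return pi_rep(pi0) / 3.0

# Sweep pi0 over [0, 1/2] and record the throughput gap T_REP - T_0.
sweep = [i / 100 for i in range(51)]
gaps = [throughput_rep(x) - throughput_oblivious(x) for x in sweep]
```

The gap equals $\pi_0(1 - 2\pi_0)/6$, so it vanishes at both endpoints $\pi_0 = 0$ and $\pi_0 = 1/2$ and is largest at $\pi_0 = 1/4$.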

The second 3-stage protocol utilizes time division multiplexing. In this protocol, the
relays take turns to forward. Since there is no collision, both relays forward with probability
1. The corresponding probability of success is

$$\pi_{TDM} = 1 - (1 - p_A)(1 - p_B) = p_A + p_B - p_A p_B.$$

Now let us consider the case when the learning time cost is negligible. When $p \in [0, \tfrac{1}{2}]^2$,
$\pi^* = \pi_0$ and thus $T^* = T_0$. Therefore

$$T^* \le \max(T_{REP}, T_{TDM}) \quad \text{when } p \in [0, \tfrac{1}{2}]^2;$$

otherwise $T^*$ can be greater than $\max(T_{REP}, T_{TDM})$.

Let us now consider the case when the learning time cost is non-negligible. Suppose learning is required
per source packet. For convenience, assume that knowledge exchange takes one time unit
until the relays reach the optimal decisions. That is, assume $\pi_{KR} = \pi^*$ at the cost of one time
unit.

Under this assumption, since both KR and TDM are 3-stage protocols, their throughput
is proportional to the probability of successful delivery and one finds

$$\pi_{KR} \le \pi^* \le \pi_{TDM}.$$

The left inequality is immediate. For the right inequality, we consider two cases. If $p \in [0, \tfrac{1}{2}]^2$,
then $\pi^* = p_A + p_B - 2 p_A p_B = \pi_{TDM} - p_A p_B \le \pi_{TDM}$. Otherwise,
$\pi^* = \max(p_A, p_B) \le \max(p_A + p_B(1 - p_A), p_B) \le \max(p_A + p_B(1 - p_A), p_B + p_A(1 - p_B)) = p_A + p_B - p_A p_B = \pi_{TDM}$.
Thus, even with an idealized learning time cost (one unit time) and performance ($\pi_{KR} = \pi^*$), an
oblivious scheme that requires no learning outperforms the ideal scheme requiring learning:

$$T_{KR} \le T_{TDM}.$$
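The right-hand inequality $\pi^* \le \pi_{TDM}$ can also be checked over a grid of link probabilities. The sketch below uses the two-case expression for $\pi^*$ from the argument above. The function names and the grid resolution are arbitrary choices made for illustration.

```python
def pi_star(p_a, p_b):
    """Optimal coordinated performance, using the two cases in the text."""
    if p_a <= 0.5 and p_b <= 0.5:
        return p_a + p_b - 2.0 * p_a * p_b
    return max(p_a, p_b)

def pi_tdm(p_a, p_b):
    """TDM success probability: 1 - (1 - p_A)(1 - p_B)."""
    return p_a + p_b - p_a * p_b

# Collect any grid points where pi* would exceed pi_TDM (none are expected).
grid = [i / 20 for i in range(21)]
violations = [(pa, pb) for pa in grid for pb in grid
              if pi_star(pa, pb) > pi_tdm(pa, pb) + 1e-12]
```

Outside $[0, 1/2]^2$ the slack is $\pi_{TDM} - \max(p_A, p_B) = \min(p_A, p_B)\,(1 - \max(p_A, p_B))$, which is nonnegative and vanishes only when one link is perfect or the other is useless.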


3.7  Common  Knowledge  about  Inconsistent  Beliefs 

Information is not knowledge but belief when it does not guarantee the inclusion of the
true state. Suppose node $i$ has a belief about the possible values of $p$: $B_i = \{p \in Z_i\}$. A
different node may have a different belief. One key observation, however, is that node $i$ regards $B_i$
as knowledge rather than belief, since otherwise it would modify $Z_i$ to include
a broader set of values. By exchanging $B_i$, distributed nodes can build common knowledge about
beliefs. Unless $Z_i$ and $Z_j$ conflict, they cannot distinguish knowledge from belief. We say
they reach the common knowledge state about consistent beliefs, or simply a common belief.
When they discover that $Z_i$ and $Z_j$ conflict, we say they reach the common knowledge state about
inconsistent beliefs. It is a practical challenge to make a strategic decision in a coordination
game while players have common knowledge about inconsistent beliefs. [4] explained that
distributed players with the same prior cannot agree to disagree. We study the game
where the prior is not defined.

This situation frequently occurs in many practical games. For example, consider a
doubles tennis match in which two partners see each other and need to decide who
returns a ball. Due to different experience, the two may have inconsistent views on the game.
Further, they know this as common knowledge. It is likely that they try to hit or leave the
ball simultaneously, failing to coordinate.

Similarly, due to imperfection or randomness of observation, two relays in a relaying
network may obtain different beliefs about the network state. After the information exchange
step, they build common knowledge about inconsistent beliefs: $K_A = \{p \in Z_A, K_B\}$ and
$K_B = \{p \in Z_B, K_A\}$.

Upon facing inconsistency, an issue about trust arises: trust about information, but not
about the intent of the information source. Trust is meta information of the coordination
game on how the distributed information should be interpreted. It is seldom explicitly stated
in the game description. When the way of trust is specified, however, it helps the nodes to
reach conciliation, if required, from inconsistency.

In a coordination game, it is obvious that the nodes with common belief cannot
outperform those with common knowledge. However, it is not clear whether consistent belief will
always be better than inconsistent belief.


3.7.1  Distrust 

It is called distrust (in others) when a node trusts only its own belief. Then the nodes will fail
to reach coordination. Suppose $Z_A = \{(0.9, 0.2)\}$ and $Z_B = \{(0.2, 0.9)\}$. Under distrust,
knowing B's best response, A's best response is $a = 1$. Similarly, B's best response is $b = 1$.
As a result, both know that $(a, b) = (1, 1)$ will be played. Note that the true network state $p$
may lie in neither $Z_A$ nor $Z_B$; it can be something else. Interestingly, it is not the case that the
incoordination from distrust always underperforms; depending on the true network state $p$,
the failure of coordination may prove to be good.

An example may suffice to convince readers. Let $p_{true} = (0.4, 0.4)$, $Z = Z_A = \{(0.9, 0.2)\}$
and $Z_B = \{(0.2, 0.9)\}$. The coordination solution is $(a, b) = (1, 0)$ based on $Z$, whose choice is
to be explained shortly. Then the coordination performance based on $Z$ is $\pi(a^*(Z), b^*(Z)) = 0.4$
while the incoordination performance is $\pi(a^*(Z_A), b^*(Z_B)) = 0.48$.
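The numbers in this example can be reproduced with a few lines of code. The sketch assumes the bilinear success probability $\pi(a, b) = p_A a + p_B b - 2 p_A p_B a b$ implied by Section 3.9.1 and restricts best responses to the endpoints 0 and 1, which is sufficient because $\pi$ is linear in each action. The helper names are hypothetical.

```python
def success_prob(a, b, p):
    """Assumed bilinear success probability pi(a, b) at network state p."""
    p_a, p_b = p
    return p_a * a + p_b * b - 2.0 * p_a * p_b * a * b

def best_a(b, belief):
    """A's best response to b under its own belief (endpoints suffice)."""
    return max((0.0, 1.0), key=lambda a: success_prob(a, b, belief))

def best_b(a, belief):
    """B's best response to a under its own belief (endpoints suffice)."""
    return max((0.0, 1.0), key=lambda b: success_prob(a, b, belief))

z_a, z_b, p_true = (0.9, 0.2), (0.2, 0.9), (0.4, 0.4)

# Under distrust each node best-responds using only its own belief.
a_distrust = best_a(1.0, z_a)
b_distrust = best_b(1.0, z_b)

distrust_value = success_prob(a_distrust, b_distrust, p_true)
coordinated_value = success_prob(1.0, 0.0, p_true)  # (a, b) = (1, 0) based on Z = Z_A
```

Here $(1, 1)$ is a mutual best response when each node evaluates the game with its own belief, and at $p_{true}$ it yields 0.48 against 0.4 for the coordinated choice.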

If the game is such that the price of the failure of coordination is significant, however,
players may elect to rely on an external conciliation rule. This rule should be performance
ignorant; since they cannot agree on the range of the true network state, there is no common
measure to compare the performance of one rule to another. The simplest way is to adopt a single
belief from the node whose lexicographical order is the highest. Then each node's decision
will be based on that single belief.


3.7.2  Partial  Trust 

In some situations the nature of the game suggests that a node give up some trust in its
initial belief and take some beliefs from others. Partial trust may arise in various forms.
We discuss a few cases: Locality trust, Meet type trust and Join type trust.

Here  the  trust  is  a  way  of  constructing  a  new  common  belief  from  the  distributed  and 
inconsistent  information.  The  choice  of  a  trust  form  should  be  mandated  across  the  players 
at  the  time  of  the  game  design,  if  any  conciliation  is  to  be  needed. 

Define $z_i^k$ to be the set of possible values for $p_k$ that node $i$ initially believes. $z_i^i$ is a belief
about its local link and $z_i^k$, $k \ne i$, about its foreign link. Then

$$Z_i := \prod_k z_i^k.$$


Locality  trust 

It is possible that link $i$ is best known to node $i$. That is, each node has trust in
everyone's local link belief but not in foreign link beliefs. In this case, the nodes with common
knowledge about inconsistent beliefs are willing to agree on a common belief that is constructed
from the most trusted elements.

$$Z_{Local} := \prod_i z_i^i$$


Meet  type  trust 

In Meet type trust, the node accepts others' information too and constructs a new common
belief as the smallest set containing all beliefs.

$$Z_{Meet} := \prod_k \bigcup_i z_i^k$$


Join type trust

In Join type trust, the node constructs a new common belief as the intersection of all
beliefs.

$$Z_{Join} := \prod_k \bigcap_i z_i^k$$

Note that $Z_{Join}$ can be degenerate when $\bigcap_i z_i^k = \emptyset$ for some $k$.

Let us restrict ourselves to the case where the result of the construction indeed constitutes
knowledge. Let $Z_A = [0.2, 0.9] \times \{0.2\}$ and $Z_B = \{0.2\} \times [0.2, 0.9]$ with $p_{true} = (0.4, 0.4)$.
Then $Z_{Local} = [0.2, 0.9]^2 = Z_{Meet}$. A simple exercise shows that the distrust solution is
$(a, b) = (1, 1)$ and a coordinated solution with local/meet type conciliation is $(a, b) = (1, 0)$.
At $p_{true}$, the distrust solution outperforms the coordinated solution after conciliation.
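The three constructions can be illustrated on interval-valued beliefs. The sketch below is an assumption-laden simplification: each per-link belief is a closed interval `(lo, hi)`, nodes and links share the labels `"A"` and `"B"`, and the union in the Meet construction is replaced by its interval hull (the two coincide in this example). The function names are hypothetical.

```python
def local_trust(beliefs):
    """Z_Local: each link k takes node k's own interval."""
    return {k: beliefs[k][k] for k in beliefs}

def meet_trust(beliefs):
    """Z_Meet: per link, the interval hull of all nodes' intervals."""
    links = next(iter(beliefs.values())).keys()
    return {k: (min(beliefs[i][k][0] for i in beliefs),
                max(beliefs[i][k][1] for i in beliefs)) for k in links}

def join_trust(beliefs):
    """Z_Join: per link, the intersection; None marks an empty factor."""
    links = next(iter(beliefs.values())).keys()
    result = {}
    for k in links:
        lo = max(beliefs[i][k][0] for i in beliefs)
        hi = min(beliefs[i][k][1] for i in beliefs)
        result[k] = (lo, hi) if lo <= hi else None
    return result

# beliefs[i][k] is node i's interval for link k, matching Z_A and Z_B above.
beliefs = {
    "A": {"A": (0.2, 0.9), "B": (0.2, 0.2)},
    "B": {"A": (0.2, 0.2), "B": (0.2, 0.9)},
}
z_local = local_trust(beliefs)
z_meet = meet_trust(beliefs)
z_join = join_trust(beliefs)
```

For these beliefs the Join construction collapses to the single point $(0.2, 0.2)$, which excludes $p_{true}$, so only the Local and Meet results constitute knowledge in this example.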

We conclude this section with a summarizing fact.


Fact  3.  Neither  belief  consistency  nor  a  common  knowledge  about  beliefs  is  sufficient  to 
achieve  an  optimal  coordination  in  a  distributed  system. 


3.8  Conclusion 

A distributed system with a common goal often faces the issue of independent
decisions with limited knowledge, where the price of coordination failure can be significant. In
this chapter we focused on understanding the impact of information uncertainties. In particular,
we studied bounded knowledge about others' knowledge and common knowledge about
inconsistent beliefs.

To make the problem down-to-earth, we adopted a simple butterfly relaying network,
which has been a popular platform in the communication networking area. In the first problem,
we showed that two relay nodes with a bounded level of knowledge about the other's knowledge
cannot reach the state of global coordination regardless of the depth of the level. However,
we also provided a scheme in which the network can asymptotically achieve the optimal
coordination via the intentional restriction of knowledge propagation. Finally, we showed
that belief consistency or the state of common knowledge about beliefs in general is not a
sufficient condition for optimal coordination.

3.9 Proofs


3.9.1  Proof  of  Lemma  2 


$$a_0 = \arg\max_a \min_{b,\, p \in Z} \pi(a, b),$$

with $Z := [p'_A, p''_A] \times [p'_B, p''_B]$. A key observation is that inside the minimization problem, nature
and node B jointly choose either $p_B b = 0$ or $p_B b = p''_B$. Then for a fixed $a$,

$$\min_{b,\, p \in Z} \pi(a, b) = \min_{p_A \in [p'_A, p''_A]} \left( p_A a,\; p_A a (1 - 2 p''_B) + p''_B \right).$$

When drawn with respect to $a$, the result of the minimization is the lower boundary of the area
spanned by the two linear curves $p_A a$ and $p_A a (1 - 2 p''_B) + p''_B$ for all feasible values of $p_A$ and $p_B$.
$a_0$ is found where this lower boundary is maximized.

When $1 - 2 p''_B \ge 0$, both curves are increasing in $a$. Thus $a_0 = 1$. When $1 - 2 p''_B < 0$,
the lower boundary is decided by the two curves $p'_A a$ and $p''_A a (1 - 2 p''_B) + p''_B$. If the former is
not greater than the latter for $a \in [0, 1]$, then the lower boundary is maximized at $a_0 = 1$.
Otherwise, their intersection is the maximum. Thus

$$a_0 = \frac{p''_B}{p'_A + p''_A (2 p''_B - 1)}.$$

Another simpler case is when $K_A(0) = \{p_A \in [p'_A, p''_A]\}$:

$$a_0 = \begin{cases} 1, & \text{for } p'_A \le 1 - p''_A, \\ \dfrac{1}{p'_A + p''_A}, & \text{otherwise.} \end{cases}$$

One way to view this result is to think of $p'_A$ as the base state and $p''_A - p'_A \ge 0$ as its
maximum departure. One can see that $a_0$ with the imprecise knowledge is never greater than
$a_0$ with the precise knowledge, where $p''_A = p'_A$. In this view, the node should forward more
cautiously when its knowledge is less precise. (The interpretation is the other way when $p''_A$
is regarded as the base state.) In either case, the guaranteed performance is never higher under
imprecise knowledge.
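The closed form for $a_0$ can be compared against a direct evaluation of the max-min problem. The sketch below assumes the bilinear success probability $\pi(a, b) = p_A a + p_B b - 2 p_A p_B a b$, which is the form consistent with the two linear curves used in the proof. Because this function is linear in each of $b$, $p_A$ and $p_B$ separately, its minimum over the box is attained at a vertex, so only vertices are enumerated. Function names and test values are illustrative.

```python
from itertools import product

def success_prob(a, b, p_a, p_b):
    """Assumed bilinear success probability pi(a, b)."""
    return p_a * a + p_b * b - 2.0 * p_a * p_b * a * b

def guaranteed(a, pa_lo, pa_hi, pb_lo, pb_hi):
    """Worst case over b in [0, 1] and p in Z, evaluated at the box vertices."""
    return min(success_prob(a, b, pa, pb)
               for b, pa, pb in product((0.0, 1.0), (pa_lo, pa_hi), (pb_lo, pb_hi)))

def a0_closed_form(pa_lo, pa_hi, pb_hi):
    """Max-min forwarding probability following the case analysis in the proof."""
    if 1.0 - 2.0 * pb_hi >= 0.0:
        return 1.0
    # Compare the two boundary curves at a = 1.
    if pa_lo <= pa_hi * (1.0 - 2.0 * pb_hi) + pb_hi:
        return 1.0
    return pb_hi / (pa_lo + pa_hi * (2.0 * pb_hi - 1.0))

def brute_force_value(pa_lo, pa_hi, pb_lo, pb_hi, grid=2001):
    """Best guaranteed performance found by a grid search over a."""
    return max(guaranteed(i / (grid - 1), pa_lo, pa_hi, pb_lo, pb_hi)
               for i in range(grid))

# Simpler case: p_B unknown in [0, 1], p_A in [0.6, 0.8], so a_0 = 1 / (0.6 + 0.8).
a0_simple = a0_closed_form(0.6, 0.8, 1.0)
```

In the simpler case the less precise the knowledge of $p_A$ (the larger $p''_A$ for a fixed $p'_A$), the smaller $a_0$ becomes, in line with the cautious-forwarding interpretation.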

When  this  knowledge  is  built  up  in  a  higher  level,  the  solution  and  the  performance  do 
not  converge  in  general. 

Figure 3.5. Feasible region of $\pi$, shaded, with respect to $a$ when $1 - p''_A < p'_A$. The minimum
is the lower boundary. Its maximum is obtained at $a_0$.


Chapter 4


Conclusion  and  Future  Work 


4.1  Conclusion 

Game theory with incomplete information usually assumes preplay knowledge about
uncertainty in the form of a probability distribution of unknown parameters. This dissertation
explores games where no such distribution is assumed, only the knowledge of the support of
the unknown parameters.

This  dissertation  studies  one-shot  two-agent  non-cooperative  and  cooperative  games. 
For  the  non-cooperative  games,  we  define  consistent  sets  as  a  product  space  of  rational 
strategies under uncertainty, and then introduce the optimism-pessimism attitude as an
additional  degree  of  strategy  for  the  players.  Corresponding  to  given  attitudes,  we  define 
consistent  sets  of  strategies  from  which  rational  players  should  not  depart.  We  then  consider 
a  two-stage  game  where  the  players  first  strategically  choose  their  attitude  to  maximize  the 
ex-post  utilities  they  receive  after  the  second  stage  where  they  play  the  game  with  known 
attitudes.  This  formulation  sometimes  results  in  a  specific  strategy  such  as  being  optimistic 
or  pessimistic.  In  such  cases,  the  agents  may  have  a  uniquely  specified  strategy  despite  the 
uncertainty. 

Next, we study a cooperative game where relay nodes collaborate to maximize the
probability of successful delivery of a packet in a wireless network. For this model, the nodes
exchange error-prone link state messages to inform each other of their link characteristics.
We  show  that,  in  this  model,  the  nodes  should  not  be  overly  cautious  in  trying  to  protect  the 
throughput  against  the  failure  of  delivery  of  a  link  state  message.  If  they  are,  the  throughput 
does  not  converge  to  a  high  value  as  the  number  of  link  state  messages  increases.  We  also 
show  that  a  more  optimistic  protocol  that  does  not  consider  the  worst  case  behavior  of  the 
other  node  has  a  throughput  that  converges  to  the  maximum  possible  value. 




4.2  Future  Work 


In  this  thesis,  we  considered  that  agents  choose  their  attitude  in  the  face  of  uncertainty. 
We  parametrized  this  attitude  by  a  degree  of  optimism  and  then  considered  a  two-stage 
game. Other schemes are conceivable for describing a player's attitude. A desirable scheme
should be able to describe the consistent sets exhaustively. Its rational equilibrium should
exist under a wide class of utility structures. Also, there should be a close connection between
that equilibrium and the Nash equilibrium under full information, when the uncertainties
decrease.

The relationship between Nash equilibria under full information and the uncertainty
equilibria of attitudes deserves some further exploration. We imagine that, as the uncertainty
grows, multiple possible distinct Nash equilibria may merge into a single uncertainty
equilibrium. Selten's trembling hand selection theory avoided this puzzle by considering only
very small uncertainties.

This dissertation provides a novel computation methodology for two-agent problems, but
has not explored more general n-agent problems. The extension of the methodology should
be studied, with careful understanding of existence, convergence, uniqueness and beyond.

Finally,  the  validation  of  this  methodology  through  behavioral  game  theory  should  be 
studied. 